About
This Python 3 code helps analyze raw measurements from an Acoustic Doppler Velocimeter (ADV) that produces, for example, *.vno and *.vna files (it should also work with other file types, though this has not been tested yet). It detects and removes spikes according to Nikora and Goring (1998) and Goring and Nikora (2002).
The code was originally developed in Matlab(R) at the Nepf Environmental Fluid Mechanics Laboratory (Massachusetts Institute of Technology).
Important
Data files (e.g., *.vno and *.vna) need to comply with the following naming convention:
XX_YY_ZZ_something.ENDING
where XX, YY, and ZZ are streamwise (x), perpendicular (y), and vertical (z) coordinates in CENTIMETERS, respectively. Anything added after ZZ_ is ignored by the code (it is only copied for the sake of dataset naming).
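The naming convention above can be illustrated with a minimal sketch. The helper parse_probe_position is hypothetical and not part of TKEanalyst; it only shows how the three leading underscore-separated fields could be read as coordinates in centimeters, assuming plain file names without a directory part.

```python
from pathlib import Path


def parse_probe_position(file_name):
    """Split an XX_YY_ZZ_something.ENDING file name into coordinates in cm.

    Hypothetical helper for illustration only; not part of TKEanalyst.
    """
    stem = Path(file_name).stem  # drop the file ending, e.g. ".vna"
    parts = stem.split("_", 3)  # at most 4 pieces: XX, YY, ZZ, remainder
    if len(parts) < 3:
        raise ValueError(f"{file_name!r} does not follow XX_YY_ZZ_something.ENDING")
    x_cm, y_cm, z_cm = (float(p) for p in parts[:3])
    label = parts[3] if len(parts) > 3 else ""  # free text after ZZ_
    return x_cm, y_cm, z_cm, label


print(parse_probe_position("8_16.5_6_T3.vna"))
```

A file name that does not start with three numeric fields raises a ValueError in this sketch, which can serve as a quick check of a data folder before running the analysis.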
Note
This documentation is also available as a style-adapted PDF.
Requirements & Installation
Time requirement: 5-10 min.
Install Requirements
To get the code running, the following software is needed; installation instructions are provided below:
Python >=3.6
NumPy >=1.17.4
Openpyxl 3.0.3
Pandas >=1.3.5
Matplotlib >=3.1.2
Start with downloading and installing the latest version of Anaconda Python. Alternatively, downloading and installing a pure Python interpreter will also work. Detailed information about installing Python is available in the Anaconda Docs and at hydro-informatics.com/python-basics.
To install the NumPy, Openpyxl, Pandas, and Matplotlib libraries after installing Anaconda, open Anaconda Prompt (e.g., click on the Windows icon, type anaconda prompt, and hit enter). In Anaconda Prompt, enter the following command sequence to install the libraries in the base environment. The installation may take a while depending on your internet speed.
conda install -c anaconda numpy
conda install -c anaconda openpyxl
conda install -c conda-forge pandas
conda install -c conda-forge matplotlib
If you are struggling with the dark window and blinking cursor of Anaconda Prompt, worry not. You can also use Anaconda Navigator and install the four libraries (in the above order) in Anaconda Navigator.
Note
Alternatively, create a new conda environment to install the four libraries for this application. However, creating a new environment may eat up a lot of disk space, and installing the Python-omnipresent libraries NumPy, Openpyxl, Pandas, and Matplotlib in the base environment does not hurt.
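To confirm that the libraries are available in the active environment, a short check such as the following sketch may help. It relies on importlib.metadata from the standard library, which assumes Python 3.8 or newer (newer than the minimum listed above); the function name installed_versions is hypothetical.

```python
from importlib import metadata  # standard library in Python >= 3.8

# Distribution names of the libraries listed in the requirements.
REQUIRED = ("numpy", "openpyxl", "pandas", "matplotlib")


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Compare the printed versions with the minimum versions listed in the requirements.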
Install TKEanalyst
Still in Anaconda Prompt (or any other Python-pip-able Terminal), enter:
pip install TKEanalyst
The last item you need to run TKEanalyst is the workbook (.xlsx) template for defining input parameters (download input.xlsx).
Usage
Regular Usage
TKEanalyst requires metadata (i.e., data about your data) defined in an input workbook. Therefore, download input.xlsx and save it on your computer. Next, with Python installed and the code living on your computer:
Save your data in a folder and make sure the files are named XX_YY_ZZ_something.FILEENDING, where XX, YY, and ZZ are streamwise (x), perpendicular (y), and vertical (z) coordinates in CENTIMETERS, respectively. FILEENDING could be, for example, .vna.
Complete the required information on the experimental setup in input.xlsx (see figure below). IMPORTANT: Never modify column A or any list in the sourcetables sheet (unless you also modify load_input_defs in line 25ff of profile_analyst.py). The code uses the text provided in these areas of input.xlsx to identify setups. If useful, consider substituting the Wood wording in your mind, and with a note in column C, with your characteristic turbulence objects, but do not modify column A. Ultimately, you can also save the input file under a different name and call the code with a different input file name.

The interface of the input.xlsx workbook for entering experiment parameters and specifying a despiking method.
Implement the following code in a Python script and run that Python script:
import TKEanalyst
input_file = r"C:\my\project\adv\input.xlsx"
TKEanalyst.process_adv_files(input_file)
- Alternatively, run the code from a terminal:
python profile_analyst.py "C:/dir/to/input.xlsx"
Wait until the code finishes with
-- DONE -- ALL TASKS FINISHED --
- After a successful run, the code will have produced the following files in ...\your-data\:
.xlsx files of full time series data, with spikes and despiked.
.xlsx files of statistic summaries (i.e., average, standard deviation std, TKE) of velocity parameters with x, y, and z positions, with spikes and despiked (see workbook example in the figure below).
Two plots (norm-tke-x.png and norm-tke-x-despiked.png) showing normalized TKE plotted against normalized x, with spikes and despiked, respectively (see plot example in the figure below).


Usage Example
For example, consider your data lives in a folder called C:\my-project\TKEanalysis\test01. To analyze *.vna files in test01, save the following code to a Python script named tke_analysis.py along with definitions in an input.xlsx workbook:
import TKEanalyst
input_file = r"C:\my-project\TKEanalysis\test01\input.xlsx"
TKEanalyst.process_adv_files(input_file)
The definitions in the above-shown input.xlsx define x-normalization as a function of a wood log length, for example, a wood log diameter of 0.114 m. Cell B2 (Input folder directory) in input.xlsx defines the folder that contains the input data for test01.
Important
The folder defined in cell B2 must not end with \ or /. Also, make sure to use the / sign for folder name separation (do not use \).
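The two rules for cell B2 (forward slashes only, no trailing separator) can be sketched as a small string helper. The function normalize_folder is hypothetical and not part of TKEanalyst; it merely shows what a compliant folder string looks like before it is typed into the workbook.

```python
def normalize_folder(folder):
    """Return a folder string with '/' separators and no trailing separator.

    Hypothetical helper for illustration only; not part of TKEanalyst.
    """
    cleaned = folder.strip().replace("\\", "/")  # use / for folder separation
    return cleaned.rstrip("/")  # drop any trailing separator


print(normalize_folder("C:\\my-project\\TKEanalysis\\test01\\"))
```

For the example folder, the helper yields C:/my-project/TKEanalysis/test01, which is the form expected in cell B2.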
- To run the code with the example data, open Anaconda Prompt (or any other Python-able Terminal) and:
cd into the code directory (e.g., cd "C:\my-project\TKEanalysis\test01")
run the code:
python tke_analysis.py
wait until the code finishes with
-- DONE -- ALL TASKS FINISHED --
- After a successful run, the code will have produced the following files in C:\my-project\TKEanalysis\test01:
.xlsx files of full time series data, with spikes and despiked.
.xlsx files of statistic summaries (i.e., average, standard deviation std, TKE) of velocity parameters with x, y, and z positions, with spikes and despiked.
Two plots (norm-tke-x.png and norm-tke-x-despiked.png) showing normalized TKE plotted against normalized x, with spikes and despiked, respectively.
Developer Docs
The following sections provide details of functions, their arguments, and outputs to help tweak the code for individual purposes.
config.py
Global parameter settings (essentially PROFILE KEYS) and message logging controls.
flowstat.py
- TKEanalyst.flowstat.flowstat(time, u, v, w1, w2, profile_type='lp')
Calculate ADV data statistics
- Parameters
time (np.array) – time in seconds
u (np.array) – streamwise velocity along x-axis (positive in bulk flow direction)
v (np.array) – perpendicular velocity along y-axis
w1 (np.array) – vertical velocity if side is DOWN
w2 (np.array) – vertical velocity if side is not DOWN
profile_type (str) – orientation of the probe (default: lp, which means the probe looks like a FlowTracker in a river)
- Returns
time_series (dict) – keys correspond to series names and values to the full time series
stats (dict(dict)) – keys correspond to series names with STAT for auto-replacement with the STAT type; values are nested dictionaries with AVRG, STD, and STDERR
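For orientation, the statistics named above (AVRG, STD, STDERR) and a turbulent kinetic energy (TKE) estimate can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation in flowstat.py: the helper names are hypothetical, the sample standard deviation (ddof=1) is an assumed choice, and TKE is taken per unit mass as half the sum of the mean squared velocity fluctuations.

```python
import numpy as np


def series_stats(series):
    """Return mean, sample standard deviation, and standard error of the mean."""
    series = np.asarray(series, dtype=float)
    std = series.std(ddof=1)  # assumption: sample (n - 1) standard deviation
    return {"AVRG": series.mean(), "STD": std, "STDERR": std / np.sqrt(series.size)}


def turbulent_kinetic_energy(u, v, w):
    """TKE per unit mass: 0.5 * (mean(u'^2) + mean(v'^2) + mean(w'^2))."""
    # fluctuation = instantaneous velocity minus time-averaged velocity
    fluctuations = [np.asarray(c, dtype=float) - np.mean(c) for c in (u, v, w)]
    return 0.5 * sum(np.mean(f ** 2) for f in fluctuations)


u = [0.21, 0.19, 0.22, 0.18]  # synthetic streamwise velocities in m/s
v = [0.01, -0.01, 0.02, -0.02]
w = [0.00, 0.01, -0.01, 0.00]
print(series_stats(u))
print(turbulent_kinetic_energy(u, v, w))
```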
profile_analyst.py
Load ADV measurements and calculate TKE with plot options. Originally coded in Matlab at Nepf Lab (MIT). Re-written in Python by Sebastian Schwindt (2022).
- TKEanalyst.profile_analyst.build_stats_summary(vna_stats_dict, experiment_info, profile_type, bulk_velocity, log_length)
Re-organize the stats dataset and assign probe coordinates
- Parameters
vna_stats_dict (dict) – the result of all vna files processed with the flowstat.flowstat function
experiment_info (dict) – the result of the get_data_info function for retrieving probe positions
profile_type (str) – profile orientation as a function of sensor position; the default is lp corresponding to DOWN (ignores w2 measurements)
bulk_velocity (float) – bulk streamwise flow velocity in m/s (from input.xlsx)
log_length (float) – characteristic log length (either diameter or length) in m (from input.xlsx)
- Returns
Organized overview pandas.DataFrame with measurement stats, ready for dumping to workbook
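How bulk_velocity and log_length could enter the normalized quantities is sketched below. The formulas are assumptions for illustration (x divided by the characteristic log length, TKE divided by the squared bulk velocity), and the column names x_m, tke, x_norm, and tke_norm are hypothetical; the actual normalization and column layout in build_stats_summary may differ.

```python
import pandas as pd


def normalize_profile(stats, bulk_velocity, log_length):
    """Add dimensionless columns x_norm = x / L and tke_norm = TKE / U^2.

    Hypothetical helper; assumes x in m, TKE in m^2/s^2, U in m/s, L in m.
    """
    out = stats.copy()  # leave the caller's DataFrame untouched
    out["x_norm"] = out["x_m"] / log_length
    out["tke_norm"] = out["tke"] / bulk_velocity ** 2
    return out


stats = pd.DataFrame({"x_m": [0.114, 0.228, 0.456], "tke": [0.004, 0.002, 0.001]})
print(normalize_profile(stats, bulk_velocity=0.2, log_length=0.114))
```

Note that the file names carry coordinates in centimeters, so a conversion to meters is assumed to have happened before this step.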
- TKEanalyst.profile_analyst.get_data_info(file_ending, folder_name='data/test-example')
Get the names of the input files and prepare the output matrix according to the number of files.
- TKEanalyst.profile_analyst.load_input_defs(file_name='input.xlsx')
Load the provided input file as a pandas DataFrame.
- TKEanalyst.profile_analyst.read_vna(vna_file_name)
Read vna file name as pandas dataframe.
- Parameters
vna_file_name (str) – name of a vna file, such as __8_16.5_6_T3.vna
- Returns
pandas.DataFrame
profile_plotter.py
Plot functions for TKE visualization
Note
The script represents merely a start for plotting normalized TKE against normalized X. If required, enrich this script with more plot functions and integrate them in profile_analyst.process_vna_files at the bottom of the function.
rmspike.py
- TKEanalyst.rmspike.rmspike(vna_df, u_stats, v_stats, w_stats, w2_stats=None, method='velocity', freq=200.0, lambda_a=1.0, k=3.0, profile_type='lp')
Spike removal and replacement - see Nikora & Goring (1998) and Goring & Nikora (2002).
- Parameters
vna_df (pandas.DataFrame) – matrix-like data array of the vna measurement file
u_stats (pandas.DataFrame) – streamwise velocity stats from flowstat function
v_stats (pandas.DataFrame) – perpendicular velocity stats from flowstat function
w_stats (pandas.DataFrame) – vertical velocity stats from flowstat function
w2_stats (pandas.DataFrame) – sec. vertical velocity stats from flowstat function (only required if profile_type is not lp)
method (str) – determines whether to use acceleration or velocity (default) for despiking
freq (float) – sampling frequency in 1/s (Hz); default is 200 Hz
lambda_a (float) – multiplier of gravitational acceleration (acceleration threshold)
k (float) – multiplier of velocity stdev (velocity threshold)
profile_type (str) – orientation of the probe (default: lp, corresponding to DOWN, which means the probe looks like a FlowTracker in a river)
Note
Goring & Nikora (2002) suggest lambda_a = 1.0 ~ 1.5 and k = 1.5, but we shall use lambda_a = 1.0 and k = 3 ~ 9. SonTek, Nortek, and Lei recommend SNR and correlation thresholds of 15 and 70, respectively. Even where data points have a high SNR, the correlation can be low.
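The roles of k, lambda_a, and freq can be illustrated with a simplified, single-pass sketch. It is not the rmspike implementation and not the full procedure of the cited papers: the function names are hypothetical, the velocity criterion flags samples deviating from the mean by more than k standard deviations, the acceleration criterion flags samples whose finite-difference acceleration exceeds lambda_a times g, and flagged samples are replaced by linear interpolation (an assumed replacement strategy).

```python
import numpy as np

GRAVITY = 9.81  # gravitational acceleration in m/s^2


def flag_spikes(velocity, method="velocity", freq=200.0, lambda_a=1.0, k=3.0):
    """Return a boolean mask that is True where a sample exceeds the threshold."""
    velocity = np.asarray(velocity, dtype=float)
    if method == "velocity":
        # velocity threshold: deviation from the mean larger than k * stdev
        return np.abs(velocity - velocity.mean()) > k * velocity.std()
    if method == "acceleration":
        # acceleration threshold: |du/dt| larger than lambda_a * g
        acceleration = np.diff(velocity, prepend=velocity[0]) * freq
        return np.abs(acceleration) > lambda_a * GRAVITY
    raise ValueError(f"unknown method: {method!r}")


def replace_spikes(velocity, spike_mask):
    """Replace flagged samples by linear interpolation between valid neighbors."""
    velocity = np.asarray(velocity, dtype=float)
    spike_mask = np.asarray(spike_mask, dtype=bool)
    index = np.arange(velocity.size)
    cleaned = velocity.copy()
    cleaned[spike_mask] = np.interp(
        index[spike_mask], index[~spike_mask], velocity[~spike_mask]
    )
    return cleaned


# Synthetic 1-second record at 200 Hz with one artificial spike at sample 50.
t = np.arange(200) / 200.0
u = 0.2 + 0.01 * np.sin(2 * np.pi * 5 * t)
u[50] = 1.5
mask = flag_spikes(u, method="velocity", k=3.0)
u_despiked = replace_spikes(u, mask)
print(mask.sum(), u[50], u_despiked[50])
```

In this sketch, the acceleration criterion flags both the jump into the spike and the jump back out of it, so it marks one more sample than the velocity criterion for an isolated single-sample spike.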
Disclaimer and License
Disclaimer (general)
No warranty is expressed or implied regarding the usefulness or completeness of the information provided for tke-analyst and its documentation. References to commercial products do not imply endorsement by the Author of tke-analyst. The concepts, materials, and methods used in the codes and described in the docs are for informational purposes only. The Author has made substantial effort to ensure the accuracy of the code and the docs and the Author shall not be held liable, nor their employers or funding sponsors, for calculations and/or decisions made on the basis of application of tke-analyst. The information is provided “as is” and anyone who chooses to use the information is responsible for her or his own choices as to what to do with the code, docs, and data and the individual is responsible for the results that follow from their decisions.
BSD 3-Clause License
Copyright (c) 2022, the Author. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.