İzzet KILIÇ - 2 months ago
Python Question

How to split dataframe according to intersection point in Python?

I am working on a project that aims to show the difference between good and bad form in an exercise. To do this, we collected acceleration data with a wrist-worn accelerometer.

[image]

The image above shows 2 sets of a fitness exercise (bench press). Each set has 10 repetitions. The image below shows the 10 repetitions of 1 set.

[image]

I have a raw data set that consists of 10 sets of an exercise. I want to split the raw data into 10 parts, each containing the part between the 2 black lines in the image above, so that I can analyze the data easily. My supervisor gave me a starting point: choose a cutpoint in each set. He said to take a cutpoint, find the first interruption time, start cutting 3 seconds before that time, count to 10, and finish cutting.

This is an idea that I don't know how to apply. At the very least, if you can tell me how to cut a dataframe at a cutpoint, I would be grateful.
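One way to cut a dataframe at a cutpoint, as a minimal sketch: it assumes the frame has a sorted DatetimeIndex, and it reads "count to 10" as a 10-second window, which is only a guess. The function name `cut_around`, its parameters, the calendar date, and the synthetic signal are all hypothetical and exist only for illustration.

```python
import numpy as np
import pandas as pd


def cut_around(frame, cutpoint, lead_s=3.0, length_s=10.0):
    """Return the rows from lead_s seconds before cutpoint, for length_s seconds.

    frame must have a sorted DatetimeIndex. length_s is an assumed reading
    of "count to 10"; change it to match the real duration of a set.
    """
    start = pd.Timestamp(cutpoint) - pd.Timedelta(seconds=lead_s)
    end = start + pd.Timedelta(seconds=length_s)
    # Label-based slicing on a DatetimeIndex includes both endpoints.
    return frame.loc[start:end]


# Synthetic 100 Hz signal standing in for the accelerometer recording.
index = pd.date_range("2000-01-01 11:24:00", periods=6000, freq="10ms")
demo = pd.DataFrame({"y": np.sin(np.arange(6000) / 50.0)}, index=index)

segment = cut_around(demo, "2000-01-01 11:24:20")
print(segment.index[0], segment.index[-1], len(segment))
```

Calling `cut_around` once per cutpoint and collecting the results in a list would give one dataframe per set.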


Well, I found another way to detect the periodic parts of my accelerometer data. Here is my code:

import numpy as np
from peakdetect import peakdetect
import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
from matplotlib import style
import pandas as pd

def get_periodic(path):
    """Return one DataFrame slice per detected set."""
    periodics = []
    # DataFrame.from_csv was removed from pandas; read_csv with a parsed
    # datetime index is the equivalent call.
    data_frame = pd.read_csv(path, index_col=0, parse_dates=True)
    data_frame.columns = ['z', 'y', 'x']
    if '1' in path:
        if 'bench' in path:
            bench_press_1_week = data_frame.between_time('11:24', '11:52')
            peak_indexes = get_peaks(bench_press_1_week.y, lookahead=3000)
            start_time = bench_press_1_week.index[0].to_pydatetime()
            for peak_index in peak_indexes:
                # Samples were recorded at 100 Hz, so index / 100 is seconds.
                periodic_start = start_time + dt.timedelta(seconds=peak_index / 100)
                periodic_end = periodic_start + dt.timedelta(seconds=60)
                periodic = bench_press_1_week.between_time(periodic_start.time(), periodic_end.time())
                periodics.append(periodic)
    return periodics

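def get_peaks(data, lookahead):
    """Return the sample index of each autocorrelation peak."""
    peak_indexes = []
    correlation = np.correlate(data, data, mode='full')
    # Keep the non-negative lags; // keeps the slice index an integer.
    realcorr = correlation[correlation.size // 2:]
    maxpeaks, minpeaks = peakdetect(realcorr, lookahead=lookahead)
    for i in range(0, len(maxpeaks)):
        # Each entry of maxpeaks is [sample_index, value].
        peak_indexes.append(maxpeaks[i][0])
    return peak_indexes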

def show_segment_plot(data, periodic_area, exercise_name):
    gs = gridspec.GridSpec(7, 2)
    ax = plt.subplot(gs[:2, :])
    k = 0
    for i in range(2, 7):
        for j in range(0, 2):
            ax = plt.subplot(gs[i, j])
            title = "{}. Set".format(k + 1)
            k = k + 1

First, this question gave me another perspective on my problem. The image below shows the raw accelerometer data of a bench press with 10 sets. It has 3 axes (x, y, z), and its major axis is y (blue in the image).

[image]

I used the autocorrelation function to detect the periodic parts.

[image]

In the image above, every peak represents 1 set of the exercise. With this peak detection algorithm I found each peak's x-axis value:

In[196]: maxpeaks

 [[16204, 32910.14013671875],
  [32281, 28726.95849609375],
  [48515, 24583.898681640625],
  [64436, 22088.130859375],
  [80335, 19582.248291015625],
  [96699, 16436.567626953125],
  [113081, 12100.027587890625],
  [129027, 8098.98486328125],
  [145184, 5387.788818359375]]
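To illustrate why the autocorrelation peaks line up with the sets, here is a small sketch on synthetic data. It assumes a made-up signal of 5 identical bursts spaced 200 samples apart, and it uses a hypothetical `local_maxima` helper as a plain stand-in for `peakdetect`; it is not that library's algorithm.

```python
import numpy as np


def autocorrelation(signal):
    """Non-negative-lag half of the full autocorrelation of a 1-D signal."""
    values = np.asarray(signal, dtype=float)
    full = np.correlate(values, values, mode="full")
    # Lag 0 sits in the middle of the full output.
    return full[full.size // 2:]


def local_maxima(values, min_distance):
    """Indexes whose value is the unique maximum within +/- min_distance."""
    peaks = []
    for i in range(min_distance, len(values) - min_distance):
        window = values[i - min_distance:i + min_distance + 1]
        is_max = values[i] == window.max()
        if is_max and np.count_nonzero(window == values[i]) == 1:
            peaks.append(i)
    return peaks


# Synthetic recording: 5 "sets", each a 50-sample burst, every 200 samples.
burst = np.sin(2 * np.pi * np.arange(50) / 10)
signal = np.zeros(1000)
for start in range(0, 1000, 200):
    signal[start:start + 50] = burst

acf = autocorrelation(signal)
peaks = local_maxima(acf, min_distance=100)
print(peaks)
print([round(float(acf[p]), 2) for p in peaks])
```

In this synthetic case the peaks fall at multiples of the burst spacing and shrink with lag, because fewer bursts overlap at larger shifts. That matches the decreasing peak heights in the output above.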

Each x-value is a sample index. My sampling frequency was 100 Hz, so 16204 / 100 = 162.04 seconds. To find the time of a periodic part, I added 162.04 seconds to the start time. Each bench press set took approximately 1 minute. In this example the exercise started at 11:24, so the first periodic part starts at about 11:26 and ends 1 minute later.

[image]

There is some lag, but this is the best solution I have found.
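The conversion from sample index to clock time can be sketched with the standard library alone. The peak positions are the first three from the output above; the calendar date is a placeholder, since only the 11:24 start time is known, and the constant names are mine.

```python
import datetime as dt

SAMPLE_RATE_HZ = 100   # sampling frequency of the accelerometer
SET_DURATION_S = 60    # each set took roughly one minute

# Peak positions (in samples) copied from the maxpeaks output.
peak_samples = [16204, 32281, 48515]

# The recording started at 11:24; the date itself is a placeholder.
recording_start = dt.datetime(2000, 1, 1, 11, 24)

windows = []
for sample in peak_samples:
    # Sample index divided by the sampling rate gives seconds from the start.
    offset = dt.timedelta(seconds=sample / SAMPLE_RATE_HZ)
    start = recording_start + offset
    windows.append((start, start + dt.timedelta(seconds=SET_DURATION_S)))

for start, end in windows:
    print(start.time(), "->", end.time())
```

Each `(start, end)` pair can then be passed to `between_time` to extract the rows of one set.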