I have a problem with the correlation of two light curves in my bachelor thesis. I use scipy.signal.correlate to calculate the correlation. The two light curves have different numbers of data points and cover different times. I think the first one has data between 2020 and 2020 and the second one before 2020. So I created data frames with pandas, merged both curves into one frame, and replaced every NaN with zero. I correlated and normalized the curves and plotted the correlation against the lags; the plot looks like it could be correct.

My problem now is that the lags seem to be data lags, not time lags: they tell me by how many data points I have to shift one curve to best match the other, not by how many days. I found a conversion, but it needs the "sampling rate" (?), which I cannot provide because my data points are not evenly spaced. I found some other modules such as Stingray, but it needs bins of equal size, and I think I can't rebin because that would lose data.

My idea was to take the difference between every point and the point with the biggest lag (which would be 312), but then I would have only 534 data points, while the correlation gives me 1067 (I still don't know why the correlation doubles the points...). I am running out of ideas. I think the following is the important code block:
    import numpy as np
    import pandas as pd
    import scipy.signal
    from scipy import signal
    import matplotlib.pyplot as plt

    def ccf_values(series1, series2):
        p = series1
        q = series2
        p = (p - np.mean(p)) / (np.std(p) * len(p))
        q = (q - np.mean(q)) / (np.std(q))
        c = scipy.signal.correlate(p, q, 'full')
        return c

    #fermi_df=merged_df[merged_df['fermi_data']!='nan']
    #print(fermi_df)
    #ztf_lc.set_index('filter', inplace=True)
    #test = pd.DataFrame({'r-data': [ztf_lc.loc['ZTF_r', 'fluxtot']]})
    #
    ztf_r_frame=ztf_lc[ztf_lc['filter']=='ZTF_r']
    #ccf_ielts = ccf_values(fermi_lc['y'], test.iloc[:,9])
    ztf_zeit = ztf_r_frame['mjd']
    fermi_zeit = fermi_lc.iloc[:,0]
    ztf_df = pd.DataFrame({'Time': ztf_zeit, 'ztf_data': ztf_r_frame['mjd']})
    fermi_df = pd.DataFrame({'Time': fermi_zeit, 'fermi_data': fermi_lc['y']})
    merged_df = pd.merge(ztf_df, fermi_df, on='Time', how='outer')
    merged_df.sort_values(by='Time', inplace=True)
    merged_df['ztf_data'] = merged_df['ztf_data'].fillna(0)
    merged_df['fermi_data'] = merged_df['fermi_data'].fillna(0)
    zeitdifferenzen_df = merged_df['Time']
    # Take the time differences from the 'time' column in sortierter_zeit_df.
    zeitdifferenzen_df['Time'] = zeitdifferenzen_df - zeitdifferenzen_df.iloc[312]
    #zeitdifferenzen_df = zeitdifferenzen_df[(zeitdifferenzen_df != 0).all(1)]
    # Now zeitdifferenzen_df contains the time differences from every time point to time point 1.
    #print(zeitdifferenzen_df)
    ccf_ielts = ccf_values(merged_df['ztf_data'],merged_df['fermi_data'])
    #ccf_ielts = ccf_values(merged_df['ztf_data'],merged_df['ztf_data'])
    lags = signal.correlation_lags(len(merged_df['fermi_data']), len(merged_df['ztf_data']))
    #lags = signal.correlation_lags(len(merged_df['ztf_data']), len(merged_df['ztf_data']))

    def ccf_plot(lags, ccf):
        fig, ax = plt.subplots(figsize=(9, 6))
        ax.plot(lags, ccf)
        ax.axhline(-2/np.sqrt(23), color='red', label='5% confidence interval')
        ax.axhline(2/np.sqrt(23), color='red')
        ax.axvline(x=0, color='black', lw=1)
        ax.axhline(y=0, color='black', lw=1)
        ax.axhline(y=np.max(ccf), color='blue', lw=1, linestyle='--', label='highest +/- correlation')
        ax.axhline(y=np.min(ccf), color='blue', lw=1, linestyle='--')
        ax.set(ylim=[-1, 1])
        ax.set_title('Cross Correation IElTS Search and Registeration Count', weight='bold', fontsize=15)
        ax.set_ylabel('Correlation Coefficients', weight='bold', fontsize=12)
        ax.set_xlabel('Time Lags', weight='bold', fontsize=12)
        plt.legend()

    #ccf_plot(zeitdifferenzen_df['time'], ccf_ielts)
    ccf_plot(zeitdifferenzen_df['Time'], ccf_ielts)

merged_df is a data frame with 534 rows. I don't know if I left out any important information; if so, please let me know.
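
A note on the doubled point count and the lag units: with mode 'full', scipy.signal.correlate returns len(a) + len(b) - 1 values, one per possible integer shift, so two 534-point inputs give 2 * 534 - 1 = 1067. The values from scipy.signal.correlation_lags are shifts in samples; they only become time lags when both series sit on a common, evenly spaced grid with step dt. The following is a minimal sketch of that idea, not the method used in the code above. It assumes that the two curves overlap in time and that linear interpolation onto a uniform grid is acceptable; the helper name `time_lag_ccf`, the step `dt`, and the synthetic test curves are all hypothetical.

```python
import numpy as np
from scipy import signal

# Mode 'full' returns one value per possible integer shift: N + M - 1 in total.
n_full = signal.correlate(np.ones(534), np.ones(534), mode="full").size  # 2 * 534 - 1


def time_lag_ccf(t1, y1, t2, y2, dt):
    """Cross-correlate two unevenly sampled curves after resampling (sketch).

    Both curves are linearly interpolated onto one uniform grid with step dt,
    restricted to their overlapping time range, so sample lags * dt are time lags.
    Assumes t1 and t2 are sorted in increasing order and overlap in time.
    """
    start = max(t1.min(), t2.min())
    stop = min(t1.max(), t2.max())
    grid = np.arange(start, stop, dt)
    a = np.interp(grid, t1, y1)
    b = np.interp(grid, t2, y2)
    # Same normalization as ccf_values in the question.
    a = (a - a.mean()) / (a.std() * len(a))
    b = (b - b.mean()) / b.std()
    ccf = signal.correlate(a, b, mode="full")
    lags = signal.correlation_lags(len(a), len(b), mode="full")
    return lags * dt, ccf


# Synthetic, unevenly sampled curves (illustrative only): the same pulse,
# with curve 1 delayed by 10 days relative to curve 2.
rng = np.random.default_rng(0)
t1 = np.arange(0, 200, 0.7)
t1 = t1 + rng.uniform(-0.3, 0.3, t1.size)
t2 = np.arange(0, 200, 1.1)
t2 = t2 + rng.uniform(-0.5, 0.5, t2.size)
y1 = np.exp(-0.5 * ((t1 - 90.0) / 5.0) ** 2)
y2 = np.exp(-0.5 * ((t2 - 80.0) / 5.0) ** 2)

lags_days, ccf = time_lag_ccf(t1, y1, t2, y2, dt=0.5)
best_lag = lags_days[np.argmax(ccf)]  # time lag (in days) at the correlation peak
```

With this argument order, a positive peak lag means the first curve is delayed relative to the second. Interpolating invents values between measurements, so the choice of `dt` and the treatment of long gaps should be checked against the real data. Filling missing values with zero, as in the merged frame above, does not put the data on an even grid.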
the plot:

Thank you very much
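
For unevenly sampled light curves, one approach that avoids rebinning the data themselves is the discrete correlation function (Edelson & Krolik 1988): every pair of points contributes a product of normalized fluxes, and the products are averaged in bins of the pairwise time difference. The following is only a rough sketch of that technique, not something taken from the code above. It omits the measurement-error correction of the original method, and the function name `discrete_correlation`, the bin edges, and the synthetic curves are all hypothetical.

```python
import numpy as np


def discrete_correlation(t1, y1, t2, y2, lag_edges):
    """Binned discrete correlation function for two unevenly sampled series (sketch).

    Returns the bin centers (time lags), the mean pairwise product per bin,
    and the number of pairs that fell into each bin.
    """
    a = (y1 - y1.mean()) / y1.std()
    b = (y2 - y2.mean()) / y2.std()
    # All pairwise time differences t1_i - t2_j and normalized products a_i * b_j.
    dt = (t1[:, None] - t2[None, :]).ravel()
    prod = (a[:, None] * b[None, :]).ravel()
    # Assign each pair to a lag bin; pairs outside the edges are dropped.
    nbins = len(lag_edges) - 1
    idx = np.digitize(dt, lag_edges) - 1
    valid = (idx >= 0) & (idx < nbins)
    sums = np.bincount(idx[valid], weights=prod[valid], minlength=nbins)
    counts = np.bincount(idx[valid], minlength=nbins)
    # Average per bin; bins without any pair stay NaN.
    dcf = np.full(nbins, np.nan)
    np.divide(sums, counts, out=dcf, where=counts > 0)
    centers = 0.5 * (lag_edges[:-1] + lag_edges[1:])
    return centers, dcf, counts


# Synthetic, unevenly sampled curves (illustrative only): the same pulse,
# with curve 1 delayed by 10 days relative to curve 2.
rng = np.random.default_rng(0)
t1 = np.arange(0, 200, 0.7)
t1 = t1 + rng.uniform(-0.3, 0.3, t1.size)
t2 = np.arange(0, 200, 1.1)
t2 = t2 + rng.uniform(-0.5, 0.5, t2.size)
y1 = np.exp(-0.5 * ((t1 - 90.0) / 5.0) ** 2)
y2 = np.exp(-0.5 * ((t2 - 80.0) / 5.0) ** 2)

lag_edges = np.arange(-51, 52, 2.0)  # 2-day lag bins, in the units of the time axis
centers, dcf, counts = discrete_correlation(t1, y1, t2, y2, lag_edges)
peak_lag = centers[np.nanargmax(dcf)]  # time lag (in days) of the strongest correlation
```

Here the lag axis is in time units by construction, because the bins are defined on time differences rather than on sample indices. The bin width is a trade-off: narrow bins resolve the lag better but contain fewer pairs each, so `counts` is worth inspecting before trusting a peak.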
Source: https://stackoverflow.com/questions/771 ... -time-lags