Pimlapat committed on
Commit 0da1b45 · verified · 1 Parent(s): 30cfd0d

Create app.py

Files changed (1)
  1. app.py +703 -0
app.py ADDED
@@ -0,0 +1,703 @@
+ LONG_ARTICLE = """"for about 20 years the problem of properties of
+ short - term changes of solar activity has been
+ considered extensively . many investigators
+ studied the short - term periodicities of the
+ various indices of solar activity . several
+ periodicities were detected , but the
+ periodicities about 155 days and from the interval
+ of @xmath3 $ ] days ( @xmath4 $ ] years ) are
+ mentioned most often . the first of them was
+ discovered by @xcite in the occurrence rate of
+ gamma - ray flares detected by the gamma - ray
+ spectrometer aboard the _ solar maximum mission _ (
+ smm ) . this periodicity was confirmed for other
+ solar flare data and for the same time period
+ @xcite . it was also found in proton flares during
+ solar cycles 19 and 20 @xcite , but it was not
+ found in the solar flare data during solar cycle
+ 22 @xcite . several authors confirmed the above
+ results for the daily sunspot area data . @xcite
+ studied the sunspot data from 1874 - 1984 . she found
+ the 155-day periodicity in data records from 31
+ years . this periodicity is always characteristic
+ for one of the solar hemispheres ( the southern
+ hemisphere for cycles 12 - 15 and the northern
+ hemisphere for cycles 16 - 21 ) . moreover , it is
+ only present during epochs of maximum activity (
+ in episodes of 1 - 3 years ) .
+ similar investigations were carried out by @xcite .
+ they applied the same power spectrum method as
+ lean , but the daily sunspot area data ( cycles
+ 12 - 21 ) were divided into 10 shorter time series .
+ the periodicities were searched for in the frequency
+ interval 57 - 115 nhz ( 100 - 200 days ) and for each of
+ 10 time series . the authors showed that the
+ periodicity between 150 - 160 days is statistically
+ significant during all cycles from 16 to 21 . the
+ considered peaks remained unaltered after
+ removing the 11-year cycle and applying the power
+ spectrum analysis . @xcite used the wavelet
+ technique for the daily sunspot areas between 1874
+ and 1993 . they determined the epochs of
+ appearance of this periodicity and concluded that
+ it is present around the maximum activity period in
+ cycles 16 to 21 . moreover , the power of this
+ periodicity started growing at cycle 19 ,
+ decreased in cycles 20 and 21 and disappeared after
+ cycle 21 . similar analyses were presented by @xcite
+ , but for sunspot number , solar wind plasma ,
+ interplanetary magnetic field and geomagnetic
+ activity index @xmath5 . during 1964 - 2000 the
+ sunspot number wavelet power of periods less than
+ one year shows a cyclic evolution with the phase
+ of the solar cycle . the 154-day period is prominent
+ and its strength is greater around the 1982 - 1984
+ interval in almost all solar wind parameters . the
+ existence of the 156-day periodicity in sunspot
+ data was confirmed by @xcite . they considered
+ the possible relation between the 475-day (
+ 1.3-year ) and 156-day periodicities . the 475-day
+ ( 1.3-year ) periodicity was also detected in
+ variations of the interplanetary magnetic field ,
+ geomagnetic activity , helioseismic data and in the
+ solar wind speed @xcite . @xcite concluded that
+ the region of larger wavelet power shifts from
+ 475-day ( 1.3-year ) period to 620-day ( 1.7-year
+ ) period and then back to 475-day ( 1.3-year ) .
+ the periodicities from the interval @xmath6 $ ]
+ days ( @xmath4 $ ] years ) have been considered
+ since 1968 . @xcite mentioned a 16.3-month (
+ 490-day ) periodicity in the sunspot numbers and
+ in the geomagnetic data . @xcite analysed the
+ occurrence rate of major flares during solar
+ cycle 19 . they found an 18-month ( 540-day )
+ periodicity in the flare rate of the northern
+ hemisphere . @xcite confirmed this result for the
+ @xmath7 flare data for solar cycles 20 and 21 and
+ found a peak in the power spectra near 510 - 540 days
+ . @xcite found a 17-month ( 510-day ) periodicity
+ of sunspot groups and their areas from 1969 to
+ 1986 . these authors concluded that the length of
+ this period is variable and the reason for this
+ periodicity is still not understood . @xcite and
+ @xcite obtained statistically significant peaks of
+ power at around 158 days for daily sunspot data
+ from 1923 - 1933 ( cycle 16 ) . in this paper the
+ problem of the existence of this periodicity for
+ sunspot data from cycle 16 is considered . the
+ daily sunspot areas , the mean sunspot areas per
+ carrington rotation , the monthly sunspot numbers
+ and their fluctuations , which are obtained after
+ removing the 11-year cycle , are analysed . in
+ section 2 the properties of the power spectrum
+ methods are described . in section 3 a new
+ approach to the problem of aliases in the power
+ spectrum analysis is presented . in section 4
+ numerical results of the new method of the
+ diagnosis of an echo - effect for sunspot area
+ data are discussed . in section 5 the problem of
+ the existence of the periodicity of about 155 days
+ during the maximum activity period for sunspot
+ data from the whole solar disk and from each solar
+ hemisphere separately is considered . to find
+ periodicities in a given time series the power
+ spectrum analysis is applied . in this paper two
+ methods are used : the fast fourier transform
+ algorithm with the hamming window function ( fft )
+ and the blackman - tukey ( bt ) power spectrum
+ method @xcite . the bt method is used for the
+ diagnosis of the reasons for the existence of peaks
+ , which are obtained by the fft method . the bt
+ method consists in the smoothing of a cosine
+ transform of an autocorrelation function using a
+ 3-point weighted average . such an estimator is
+ consistent and unbiased . moreover , the peaks are
+ uncorrelated and their sum is the variance of the
+ considered time series . the main disadvantage of
+ this method is a poor resolution of the
+ periodogram points , particularly for low
+ frequencies . for example , if the autocorrelation
+ function is evaluated for @xmath8 , then the
+ distribution points in the time domain are :
+ @xmath9 thus , it is obvious that this method
+ should not be used for detecting low frequency
+ periodicities with a fairly good resolution .
+ however , because of an application of the
+ autocorrelation function , the bt method can be
+ used to verify the reality of peaks which are
+ computed using a method giving better
+ resolution ( for example the fft method ) . it is
+ worth remembering that the power spectrum
+ methods should be applied very carefully . the
+ difficulties in the interpretation of significant
+ peaks could be caused by at least four effects : a
+ sampling of a continuous function , an echo -
+ effect , a contribution of long - term
+ periodicities and a random noise . the first effect
+ exists because periodicities , which are shorter
+ than the sampling interval , may mix with longer
+ periodicities . this effect can be
+ reduced by a decrease of the sampling interval
+ between observations . the echo - effect occurs
+ when there is a latent harmonic of frequency
+ @xmath10 in the time series , giving a spectral
+ peak at @xmath10 , and also periodic terms of
+ frequency @xmath11 etc . this may be detected by
+ the autocorrelation function for time series with
+ a large variance . time series often contain long
+ - term periodicities that influence short - term
+ peaks . they could raise the periodogram 's peaks at
+ lower frequencies . however , it is also easy to
+ notice the influence of the long - term
+ periodicities on short - term peaks in the graphs
+ of the autocorrelation functions . this effect is
+ observed for the time series of solar activity
+ indexes which are limited by the 11-year cycle .
+ to find statistically significant periodicities it
+ is reasonable to use the autocorrelation function
+ and the power spectrum method with a high
+ resolution . in the case of a stationary time
+ series they give similar results . moreover , for
+ a stationary time series with zero mean the
+ fourier transform is equivalent to the cosine
+ transform of an autocorrelation function @xcite .
+ thus , after a comparison of a periodogram with an
+ appropriate autocorrelation function one can
+ detect peaks which are in the graph of the first
+ function and do not exist in the graph of the
+ second function . the reasons for their existence
+ could be explained by the long - term
+ periodicities and the echo - effect . the method described below
+ enables one to detect these effects . ( solid line
+ ) and the 95% confidence level based on the red
+ noise ( dotted line ) . the periodogram values are
+ presented on the left axis . the lower curve
+ illustrates the autocorrelation function of the
+ same time series ( solid line ) . the dotted lines
+ represent two standard errors of the
+ autocorrelation function . the dashed horizontal
+ line shows the zero level . the autocorrelation
+ values are shown on the right axis . ]  because
+ the statistical tests indicate that the time
+ series is white noise the confidence level is
+ not marked . ] . ]  the method of the diagnosis
+ of an echo - effect in the power spectrum ( de )
+ consists in an analysis of a periodogram of a
+ given time series computed using the bt method .
+ the bt method is based on the cosine transform of the
+ autocorrelation function which creates peaks which
+ are in the periodogram , but not in the
+ autocorrelation function . the de method is used
+ for peaks which are computed by the fft method (
+ with high resolution ) and are statistically
+ significant . the time series of sunspot activity
+ indexes with the spacing interval one rotation or
+ one month contain a markov - type persistence ,
+ which means a tendency for the successive values
+ of the time series to remember their antecedent
+ values . thus , i use a confidence level based on
+ the red noise of markov @xcite for the choice of
+ the significant peaks of the periodogram computed
+ by the fft method . when a time series does not
+ contain the markov - type persistence i apply the
+ fisher test and the kolmogorov - smirnov test at
+ the significance level @xmath12 @xcite to verify the
+ statistical significance of periodogram peaks .
+ the fisher test checks the null hypothesis that
+ the time series is white noise against the
+ alternative hypothesis that the time series
+ contains an added deterministic periodic component
+ of unspecified frequency . because the fisher test
+ tends to be severe in rejecting peaks as
+ insignificant the kolmogorov - smirnov test is
+ also used . the de method analyses raw estimators
+ of the power spectrum . they are given as follows
+ @xmath13 for @xmath14 + where @xmath15 for
+ @xmath16 + @xmath17 is the length of the time
+ series @xmath18 and @xmath19 is the mean value .
+ the first term of the estimator @xmath20 is
+ constant . the second term takes two values (
+ depending on odd or even @xmath21 ) which are not
+ significant because @xmath22 for large m . thus ,
+ the third term of ( 1 ) should be analysed .
+ looking for intervals of @xmath23 for which
+ @xmath24 has the same sign and different signs one
+ can find such parts of the function @xmath25 which
+ create the value @xmath20 . let the set of values
+ of the independent variable of the autocorrelation
+ function be called @xmath26 ; it can be divided
+ into a union of disjoint sets : @xmath27 where +
+ @xmath28 + @xmath29 @xmath30 @xmath31 + @xmath32 +
+ @xmath33 @xmath34 @xmath35 @xmath36 @xmath37
+ @xmath38 @xmath39 @xmath40   the set
+ @xmath41 contains all integer values of @xmath23
+ from the interval of @xmath42 for which the
+ autocorrelation function and the cosine function
+ with the period @xmath43 $ ] are positive . the
+ index @xmath44 indicates successive parts of the
+ cosine function for which the cosines of
+ successive values of @xmath23 have the same sign .
+ however , sometimes the set @xmath41 can be empty
+ . for example , for @xmath45 and @xmath46 the set
+ @xmath47 should contain all @xmath48 $ ] for which
+ @xmath49 and @xmath50 , but for such values of
+ @xmath23 the values of @xmath51 are negative .
+ thus , the set @xmath47 is empty .    . the
+ periodogram values are presented on the left axis
+ . the lower curve illustrates the autocorrelation
+ function of the same time series . the
+ autocorrelation values are shown on the right axis
+ . ]    let us take into consideration all sets
+ \{@xmath52 } , \{@xmath53 } and \{@xmath41 } which
+ are not empty . because the numbering and cardinality of
+ these sets depend on the form of the
+ autocorrelation function of the given time series
+ , it is impossible to establish them arbitrarily .
+ thus , the sets of appropriate indexes of the sets
+ \{@xmath52 } , \{@xmath53 } and \{@xmath41 } are
+ called @xmath54 , @xmath55 and @xmath56
+ respectively . for example the set @xmath56
+ contains all @xmath44 from the set @xmath57 for
+ which the sets @xmath41 are not empty . to
+ separate quantitatively in the estimator @xmath20
+ the positive contributions which originate from
+ the cases described by the formula ( 5 ) from the
+ cases which are described by the formula ( 3 ) the
+ following indexes are introduced : @xmath58
+ @xmath59 @xmath60 @xmath61 where @xmath62 @xmath63
+ @xmath64 taking for the empty sets \{@xmath53 }
+ and \{@xmath41 } the indices @xmath65 and @xmath66
+ equal zero . the index @xmath65 describes a
+ percentage of the contribution of the case when
+ @xmath25 and @xmath51 are positive to the positive
+ part of the third term of the sum ( 1 ) . the
+ index @xmath66 describes a similar contribution ,
+ but for the case when both @xmath25 and
+ @xmath51 are simultaneously negative . thanks to
+ these one can decide whether the positive or the
+ negative values of the autocorrelation function
+ have a larger contribution to the positive values
+ of the estimator @xmath20 . when the difference
+ @xmath67 is positive , the statement that the
+ @xmath21-th peak really exists can not be rejected
+ . thus , the following formula should be satisfied
+ : @xmath68 because the @xmath21-th peak could
+ exist as a result of the echo - effect , it is
+ necessary to verify the second condition :
+ @xmath69\in c_m.\ ] ]    . the periodogram values
+ are presented on the left axis . the lower curve
+ illustrates the autocorrelation function of the
+ same time series ( solid line ) . the dotted lines
+ represent two standard errors of the
+ autocorrelation function . the dashed horizontal
+ line shows the zero level . the autocorrelation
+ values are shown on the right axis . ]    to
+ verify the implication ( 8 ) firstly it is
+ necessary to evaluate the sets @xmath41 for
+ @xmath70 of the values of @xmath23 for which the
+ autocorrelation function and the cosine function
+ with the period @xmath71 $ ] are positive and the
+ sets @xmath72 of values of @xmath23 for which the
+ autocorrelation function and the cosine function
+ with the period @xmath43 $ ] are negative .
+ secondly , a percentage of the contribution of the
+ sum of products of positive values of @xmath25 and
+ @xmath51 to the sum of positive products of the
+ values of @xmath25 and @xmath51 should be
+ evaluated . as a result the indexes @xmath65 for
+ each set @xmath41 where @xmath44 is the index from
+ the set @xmath56 are obtained . thirdly , from all
+ sets @xmath41 such that @xmath70 the set @xmath73
+ for which the index @xmath65 is the greatest
+ should be chosen . the implication ( 8 ) is true
+ when the set @xmath73 includes the considered
+ period @xmath43 $ ] . this means that the greatest
+ contribution of positive values of the
+ autocorrelation function and positive cosines with
+ the period @xmath43 $ ] to the periodogram value
+ @xmath20 is caused by the sum of positive products
+ of @xmath74 for each @xmath75-\frac{m}{2k},[\frac{
+ 2m}{k}]+\frac{m}{2k})$ ] . when the implication
+ ( 8 ) is false , the peak @xmath20 is mainly
+ created by the sum of positive products of
+ @xmath74 for each @xmath76-\frac{m}{2k},\big [
+ \frac{2m}{n}\big ] + \frac{m}{2k } \big ) $ ] ,
+ where @xmath77 is a multiple or a divisor of
+ @xmath21 . it is necessary to add that the de
+ method should be applied to the periodogram peaks
+ which probably exist because of the echo -
+ effect . it enables one to find such parts of the
+ autocorrelation function which have a
+ significant contribution to the considered peak .
+ the fact that the conditions ( 7 ) and ( 8 ) are
+ satisfied can unambiguously decide about the
+ existence of the considered periodicity in the
+ given time series , but if at least one of them is
+ not satisfied , one can doubt the existence
+ of the considered periodicity . thus , in such
+ cases one should state that the peak can not be treated as
+ true . using the de method it is
+ necessary to keep in mind the cardinality of the set
+ @xmath78 . if @xmath79 is too large , errors of an
+ autocorrelation function estimation appear . they
+ are caused by the finite length of the given time
+ series and as a result additional peaks of the
+ periodogram occur . if @xmath79 is too small ,
+ there are fewer peaks because of a low resolution
+ of the periodogram . in applications @xmath80 is
+ used . in order to evaluate the value @xmath79 the
+ fft method is used . the periodograms computed by
+ the bt and the fft method are compared . their
+ agreement enables one to obtain the value
+ @xmath79 .    . the fft periodogram values are
+ presented on the left axis . the lower curve
+ illustrates the bt periodogram of the same time
+ series ( solid line and large black circles ) .
+ the bt periodogram values are shown on the right
+ axis . ] in this paper the sunspot activity data (
+ august 1923 - october 1933 ) provided by the
+ greenwich photoheliographic results ( gpr ) are
+ analysed . firstly , i consider the monthly
+ sunspot number data . to eliminate the 11-year
+ trend from these data , the consecutively smoothed
+ monthly sunspot number @xmath81 is subtracted from
+ the monthly sunspot number @xmath82 where the
+ consecutive mean @xmath83 is given by @xmath84 the
+ values @xmath83 for @xmath85 and @xmath86 are
+ calculated using additional data from the last six
+ months of cycle 15 and the first six months of cycle
+ 17 . because of the north - south asymmetry of
+ various solar indices @xcite , the sunspot
+ activity is considered for each solar hemisphere
+ separately . analogously to the monthly sunspot
+ numbers , the time series of sunspot areas in the
+ northern and southern hemispheres with the spacing
+ interval @xmath87 rotation are denoted . in order
+ to find periodicities , the following time series
+ are used : + @xmath88   +   @xmath89   +   @xmath90
+ + in the lower part of figure [ f1 ] the
+ autocorrelation function of the time series for
+ the northern hemisphere @xmath88 is shown . it is
+ easy to notice that the prominent peak falls at an
+ interval of 17 rotations ( 459 days ) and @xmath25 for
+ @xmath91 $ ] rotations ( [ 81 , 162 ] days ) is
+ significantly negative . the periodogram of the
+ time series @xmath88 ( see the upper curve in
+ figure [ f1 ] ) does not show significant
+ peaks at @xmath92 rotations ( 135 , 162 days ) ,
+ but there is a significant peak at @xmath93 (
+ 243 days ) . the peaks at @xmath94 are close to
+ the peaks of the autocorrelation function . thus ,
+ the results obtained for the periodicity at about
+ @xmath0 days contradict the results
+ obtained for the time series of daily sunspot
+ areas @xcite .    for the southern hemisphere (
+ the lower curve in figure [ f2 ] ) @xmath25 for
+ @xmath95 $ ] rotations ( [ 54 , 189 ] days ) is
+ not positive except @xmath96 ( 135 days ) for
+ which @xmath97 is not statistically significant .
+ the upper curve in figure [ f2 ] presents the
+ periodogram of the time series @xmath89 . this
+ time series does not contain a markov - type
+ persistence . moreover , the kolmogorov - smirnov
+ test and the fisher test do not reject the null
+ hypothesis that the time series is white noise
+ only . this means that the time series does not
+ contain an added deterministic periodic component
+ of unspecified frequency . the autocorrelation
+ function of the time series @xmath90 ( the lower
+ curve in figure [ f3 ] ) has only one
+ statistically significant peak for @xmath98 months
+ ( 480 days ) and negative values for @xmath99 $ ]
+ months ( [ 90 , 390 ] days ) . however , the
+ periodogram of this time series ( the upper curve
+ in figure [ f3 ] ) has two significant peaks , the
+ first at 15.2 and the second at 5.3 months ( 456 ,
+ 159 days ) . thus , the periodogram contains a
+ significant peak , although the autocorrelation
+ function has a negative value at @xmath100
+ months .    to explain these problems the
+ following two time series of daily sunspot areas are
+ considered : + @xmath101 + @xmath102 +   where
+ @xmath103    the values @xmath104 for @xmath105
+ and @xmath106 are calculated using additional
+ daily data from the solar cycles 15 and 17 .
+ and the cosine function for @xmath45 ( the period
+ at about 154 days ) . the horizontal line ( dotted
+ line ) shows the zero level . the vertical dotted
+ lines mark the intervals where the sets
+ @xmath107 ( for @xmath108 ) are searched . the
+ percentage values show the index @xmath65 for each
+ @xmath41 for the time series @xmath102 ( in
+ parentheses for the time series @xmath101 ) . in
+ the right bottom corner the values of @xmath65 for
+ the time series @xmath102 , for @xmath109 are
+ written . ]    ( the 500-day period ) ]    the
+ comparison of the functions @xmath25 of the time
+ series @xmath101 ( the lower curve in figure [ f4
+ ] ) and @xmath102 ( the lower curve in figure [ f5
+ ] ) suggests that the positive values of the
+ function @xmath110 of the time series @xmath101 in
+ the interval of @xmath111 $ ] days could be caused
+ by the 11-year cycle . this effect is not visible
+ in the case of periodograms of both time
+ series computed using the fft method ( see the
+ upper curves in figures [ f4 ] and [ f5 ] ) or the
+ bt method ( see the lower curve in figure [ f6 ] )
+ . moreover , the periodogram of the time series
+ @xmath102 has significant values at @xmath112
+ days , but the autocorrelation function is
+ negative at these points . @xcite showed that the
+ lomb - scargle periodograms for both time
+ series ( see @xcite , figures 7 a - c ) have a
+ peak at 158.8 days which stands above the fap level
+ by a significant amount . using the de method the
+ above discrepancies are obvious . to establish the
+ @xmath79 value the periodograms computed by the
+ fft and the bt methods are shown in figure [ f6 ]
+ ( the upper and the lower curve respectively ) .
+ for @xmath46 and for periods less than 166 days
+ there is a good conformity of both
+ periodograms ( but for periods greater than 166
+ days the points of the bt periodogram are not
+ linked because the bt periodogram has much worse
+ resolution than the fft periodogram ( no one knows
+ how to do it ) ) . for @xmath46 and @xmath113 the
+ value of @xmath21 is 13 ( @xmath71=153 $ ] ) . the
+ inequality ( 7 ) is satisfied because @xmath114 .
+ this means that the value of @xmath115 is mainly
+ created by positive values of the autocorrelation
+ function . the implication ( 8 ) needs an
+ evaluation of the greatest value of the index
+ @xmath65 where @xmath70 , but the solar data
+ contain the most prominent period for @xmath116
+ days because of the solar rotation . thus ,
+ although @xmath117 for each @xmath118 , all sets
+ @xmath41 ( see ( 5 ) and ( 6 ) ) without the set
+ @xmath119 ( see ( 4 ) ) , which contains @xmath120
+ $ ] , are considered . this situation is presented
+ in figure [ f7 ] . in this figure two curves
+ @xmath121 and @xmath122 are plotted . the vertical
+ dotted lines mark the intervals where the sets
+ @xmath107 ( for @xmath123 ) are searched . for
+ such @xmath41 two numbers are written : in
+ parentheses the value of @xmath65 for the time
+ series @xmath101 and above it the value of
+ @xmath65 for the time series @xmath102 . to make
+ this figure clear the curves are plotted for the
+ set @xmath124 only . ( in the right bottom corner
+ the values of @xmath65 for the
+ time series @xmath102 , for @xmath109 are written
+ . ) the implication ( 8 ) is not true , because
+ @xmath125 for @xmath126 . therefore ,
+ @xmath43=153\notin c_6=[423,500]$ ] . moreover ,
+ the autocorrelation function for @xmath127 $ ] is
+ negative and the set @xmath128 is empty . thus ,
+ @xmath129 . on the basis of this information one
+ can state that the periodogram peak at @xmath130
+ days of the time series @xmath102 exists because
+ of positive @xmath25 , but for @xmath23 from the
+ intervals which do not contain this period .
+ looking at the values of @xmath65 of the time
+ series @xmath101 , one can notice that they
+ decrease when @xmath23 increases until @xmath131 .
+ this indicates that when @xmath23 increases ,
+ the contribution of the 11-year cycle to the peaks
+ of the periodogram decreases . an increase of the
+ value of @xmath65 occurs for @xmath132 for both
+ time series , although the contribution of the
+ 11-year cycle for the time series @xmath101 is
+ insignificant . thus , this part of the
+ autocorrelation function ( @xmath133 for the time
+ series @xmath102 ) influences the @xmath21-th peak
+ of the periodogram . this suggests that the
+ periodicity at about 155 days is a harmonic of the
+ periodicity from the interval of @xmath1 $ ] days
+ . ( solid line ) and consecutively smoothed
+ sunspot areas of the one rotation time interval
+ @xmath134 ( dotted line ) . both indexes are
+ presented on the left axis . the lower curve
+ illustrates fluctuations of the sunspot areas
+ @xmath135 . the dotted and dashed horizontal lines
+ represent levels zero and @xmath136 respectively .
+ the fluctuations are shown on the right axis . ]
+ the described reasoning can be carried out for
+ other values of the periodogram . for example ,
+ the condition ( 8 ) is not satisfied for @xmath137
+ ( 250 , 222 , 200 days ) . moreover , the
+ autocorrelation function at these points is
+ negative . this suggests that there is no true
+ periodicity in the interval of [ 200 , 250 ] days
+ . it is difficult to decide about the existence of
+ the periodicities for @xmath138 ( 333 days ) and
+ @xmath139 ( 286 days ) on the basis of the above
+ analysis . the implication ( 8 ) is not satisfied
+ for @xmath139 and the condition ( 7 ) is not
+ satisfied for @xmath138 , although the function
+ @xmath25 of the time series @xmath102 is
+ significantly positive for @xmath140 . the
+ conditions ( 7 ) and ( 8 ) are satisfied for
+ @xmath141 ( figure [ f8 ] ) and @xmath142 .
+ therefore , it is possible that there exists a
+ periodicity from the interval of @xmath1 $ ] days
+ . similar results were also obtained by @xcite for
544
+ daily sunspot numbers and daily sunspot areas .
545
+ she considered the means of three periodograms of
546
+ these indexes for data from @xmath143 years and
547
+ found statistically significant peaks from the
548
+ interval of @xmath1 $ ] ( see @xcite , figure 2 )
549
+ . @xcite studied sunspot areas from 1876 - 1999
550
+ and sunspot numbers from 1749 - 2001 with the help
551
+ of the wavelet transform . they pointed out that
552
+ the 154 - 158-day period could be the third
553
+ harmonic of the 1.3-year ( 475-day ) period .
554
+ moreover , the both periods fluctuate considerably
555
+ with time , being stronger during stronger sunspot
556
+ cycles . therefore , the wavelet analysis suggests
557
+ a common origin of the both periodicities . this
558
+ conclusion confirms the de method result which
559
+ indicates that the periodogram peak at @xmath144
560
+ days is an alias of the periodicity from the
561
+ interval of @xmath1 $ ] in order to verify the
562
+ existence of the periodicity at about 155 days i
563
+ consider the following time series : + @xmath145
564
+ + @xmath146 + @xmath147 + the value @xmath134
565
+ is calculated analogously to @xmath83 ( see sect .
566
+ the values @xmath148 and @xmath149 are evaluated
567
+ from the formula ( 9 ) . in the upper part of
568
+ figure [ f9 ] the time series of sunspot areas
569
+ @xmath150 of the one rotation time interval from
570
+ the whole solar disk and the time series of
571
+ consecutively smoothed sunspot areas @xmath151 are
572
+ shown . in the lower part of figure [ f9 ] the
573
+ time series of sunspot area fluctuations @xmath145
574
+ is presented . on the basis of these data the
575
+ maximum activity period of cycle 16 is evaluated .
576
+ it is an interval between two strongest
577
+ fluctuations i.e . @xmath152 $ ] rotations . the
578
+ length of the time interval @xmath153 is 54
579
+ rotations . if the about @xmath0-day ( 6 solar
580
+ rotations ) periodicity existed in this time
581
+ interval and it was characteristic for strong
582
+ fluctuations from this time interval , 10 local
583
+ maxima in the set of @xmath154 would be seen .
584
+ then it should be necessary to find such a value
585
+ of p for which @xmath155 for @xmath156 and the
586
+ number of the local maxima of these values is 10 .
587
+ as it can be seen in the lower part of figure [ f9
588
+ ] this is for the case of @xmath157 ( in this
589
+ figure the dashed horizontal line is the level of
590
+ @xmath158 ) . figure [ f10 ] presents nine time
591
+ distances among the successive fluctuation local
592
+ maxima and the horizontal line represents the
593
+ 6-rotation periodicity . it is immediately
594
+ apparent that the dispersion of these points is 10
595
+ and it is difficult to find even a few points which
596
+ oscillate around the value of 6 . such an analysis
597
+ was carried out for smaller and larger @xmath136
598
+ and the results were similar . therefore , the
599
+ fact , that the about @xmath0-day periodicity
600
+ exists in the time series of sunspot area
601
+ fluctuations during the maximum activity period is
602
+ questionable . . the horizontal line represents
603
+ the 6-rotation ( 162-day ) period . ] ] ]
604
+ to verify again the existence of the about
605
+ @xmath0-day periodicity during the maximum
606
+ activity period in each solar hemisphere
607
+ separately , the time series @xmath88 and @xmath89
608
+ were also cut down to the maximum activity period
609
+ ( january 1925 - december 1930 ) . the comparison of
610
+ the autocorrelation functions of these time series
611
+ with the appropriate autocorrelation functions of
612
+ the time series @xmath88 and @xmath89 , which are
613
+ computed for the whole 11-year cycle ( the lower
614
+ curves of figures [ f1 ] and [ f2 ] ) , indicates
615
+ that there are no significant differences between
616
+ them especially for @xmath23=5 and 6 rotations (
617
+ 135 and 162 days ) . this conclusion is
618
+ confirmed by the analysis of the time series
619
+ @xmath146 for the maximum activity period . the
620
+ autocorrelation function ( the lower curve of
621
+ figure [ f11 ] ) is negative for the interval of [
622
+ 57 , 173 ] days , but the resolution of the
623
+ periodogram is too low to find the significant
624
+ peak at @xmath159 days . the autocorrelation
625
+ function gives the same result as for daily
626
+ sunspot area fluctuations from the whole solar
627
+ disk ( @xmath160 ) ( see also the lower curve of
628
+ figures [ f5 ] ) . in the case of the time series
629
+ @xmath89 @xmath161 is zero for the fluctuations
630
+ from the whole solar cycle and it is almost zero (
631
+ @xmath162 ) for the fluctuations from the maximum
632
+ activity period . the value @xmath163 is negative
633
+ . similarly to the case of the northern hemisphere
634
+ the autocorrelation function and the periodogram
635
+ of southern hemisphere daily sunspot area
636
+ fluctuations from the maximum activity period
637
+ @xmath147 are computed ( see figure [ f12 ] ) .
638
+ the autocorrelation function has the statistically
639
+ significant positive peak in the interval of [ 155
640
+ , 165 ] days , but the periodogram has too low
641
+ resolution to decide about the possible
642
+ periodicities . the correlative analysis indicates
643
+ that there are positive fluctuations with time
644
+ distances about @xmath0 days in the maximum
645
+ activity period . the results of the analyses of
646
+ the time series of sunspot area fluctuations from
647
+ the maximum activity period contradict
648
+ the conclusions of @xcite . she uses the power
649
+ spectrum analysis only . the periodogram of daily
650
+ sunspot fluctuations contains peaks , which could
651
+ be harmonics or subharmonics of the true
652
+ periodicities . they could be treated as real
653
+ periodicities . this effect is not visible for
654
+ sunspot data of the one rotation time interval ,
655
+ but averaging could lose true periodicities . this
656
+ is observed for data from the southern hemisphere
657
+ . there is the about @xmath0-day peak in the
658
+ autocorrelation function of daily fluctuations ,
659
+ but the correlation for data of the one rotation
660
+ interval is almost zero or negative at the points
661
+ @xmath164 and 6 rotations . thus , it is
662
+ reasonable to research both time series together
663
+ using the correlative and the power spectrum
664
+ analyses . the following results are obtained :
665
+ 1 . a new method of the detection of statistically
666
+ significant peaks of the periodograms enables one
667
+ to identify aliases in the periodogram . 2 . two
668
+ effects cause the existence of the peak of the
669
+ periodogram of the time series of sunspot area
670
+ fluctuations at about @xmath0 days : the first is
671
+ caused by the 27-day periodicity , which probably
672
+ creates the 162-day periodicity ( it is a
673
+ subharmonic frequency of the 27-day periodicity )
674
+ and the second is caused by statistically
675
+ significant positive values of the autocorrelation
676
+ function from the intervals of @xmath165 $ ] and
677
+ @xmath166 $ ] days . the existence of the
678
+ periodicity of about @xmath0 days of the time
679
+ series of sunspot area fluctuations and sunspot
680
+ area fluctuations from the northern hemisphere
681
+ during the maximum activity period is questionable
682
+ . the autocorrelation analysis of the time series
683
+ of sunspot area fluctuations from the southern
684
+ hemisphere indicates that the periodicity of about
685
+ 155 days exists during the maximum activity period
686
+ . i appreciate valuable comments from professor j.
687
+ jakimiec ."""
688
+
689
+ from transformers import LEDForConditionalGeneration, LEDTokenizer
+ import torch
+
+ MODEL_NAME = "allenai/led-large-16384-arxiv"
+ # fall back to CPU when no GPU is available
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+
+ tokenizer = LEDTokenizer.from_pretrained(MODEL_NAME)
+
+ # truncate to the 16384-token input limit of this checkpoint
+ input_ids = tokenizer(LONG_ARTICLE, return_tensors="pt", truncation=True, max_length=16384).input_ids.to(device)
+ global_attention_mask = torch.zeros_like(input_ids)
+ # set global attention on the first token
+ global_attention_mask[:, 0] = 1
+
+ model = LEDForConditionalGeneration.from_pretrained(MODEL_NAME).to(device)
+
+ # request a structured output so that .sequences is available
+ sequences = model.generate(input_ids, global_attention_mask=global_attention_mask, return_dict_in_generate=True).sequences
+
+ # decode the generated token ids back to text, dropping special tokens such as <s> and </s>
+ summary = tokenizer.batch_decode(sequences, skip_special_tokens=True)