School on Nonlinear Time Series Analysis and Complex Networks in the Big Data Era

February 19 – March 2, 2018

São Paulo, Brazil

ICTP-SAIFR/IFT-UNESP

Home

This two-week school will give participants (mainly PhD students and postdocs) a broad overview of the state of the art in Big Data analysis tools, including the most recent advances in complex networks and in methods for analyzing large time series and datasets, with a focus on nonlinear dynamics and network science.

Topics to be covered include:

  • Time delay embedding and phase space reconstruction
  • Tools for chaotic systems (Lyapunov exponents, fractal dimensions)
  • Symbolic encoding techniques
  • Information-theory measures (block entropy, permutation entropy, mutual information, conditional mutual information)
  • Complexity measures
  • Extreme value analysis
  • Structure of Networks
  • Mapping time series to networks (recurrence networks, visibility graphs)
  • From big data to networks: compressive sampling, dynamic mode decomposition
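As an illustration of the first topic in the list above, time delay embedding can be sketched in a few lines. This is a minimal sketch, not course material: the function name `delay_embed` and the parameter names `dim` and `tau` are assumptions chosen for the example.

```python
def delay_embed(series, dim, tau):
    """Return delay vectors (x[i], x[i+tau], ..., x[i+(dim-1)*tau]).

    `dim` is the embedding dimension and `tau` the delay in samples;
    both names are illustrative choices, not a fixed convention.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return [
        tuple(series[i + k * tau] for k in range(dim))
        for i in range(n_vectors)
    ]


# A short scalar series embedded in 3 dimensions with a delay of 2 samples.
x = [0, 1, 2, 3, 4, 5, 6]
print(delay_embed(x, dim=3, tau=2))
```

In practice the delay and the embedding dimension have to be chosen from the data; the sketch only shows how the reconstructed vectors are assembled once they are fixed.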

Introductory lectures will be followed by hands-on sessions where the students will have the opportunity to work in small groups on real-world datasets. In these sessions the participants will gain practical experience in applying the nonlinear “big-data” tools to the observed output signals of complex systems. This school is part of the topics in Nonlinear Science: Fundamentals and Applications. There is no registration fee, and limited funds are available for travel and local expenses for participants from academic or research institutions.

Organizers:

  • Hilda A. Cerdeira (IFT-UNESP, Brazil)
  • Jesús Gómez-Gardeñes (University of Zaragoza, Spain)
  • Cristina Masoller (Universitat Politecnica de Catalunya-Terrassa, Spain) 

Satisfaction survey:

Lecturers

Lecturers:

  • Alex Arenas (Universitat Rovira i Virgili, Spain): Big Data Analysis

1. Introduction: Big data scenario.

2. Data gathering: The problem of big data gathering.

3. Data storage: How to store and access big data.

4. Data preprocessing: How to pre-process big data.

5. Exploratory data analysis: How to perform exploratory data analysis.

6. Data to models: How to model with data.

  • Murilo Baptista (University of Aberdeen, UK): A fast-track glimpse to the life cycle of modelling complex systems

The pathway to modelling a complex system from available data starts with collecting and pre-processing the data, followed by its characterization and modelling. In this series of lectures and hands-on activities, I will provide the audience with a package of tools and mathematical approaches for doing big-data science in complex systems. I will start with an example of how to collect gravity time series around the globe and how to apply standard preprocessing transformations to prepare the data for characterization and basic physical modelling. In addition to standard characterization tools designed for data with strong periodic forcing (e.g. power spectrum analysis) or with sensitivity to initial conditions (e.g. Lyapunov exponents), I will teach information-theoretic tools specially designed to evaluate how the available experimental time series (or data) are causally connected. These tools are based on the mutual information and the recently defined causal mutual information, quantities that can be conveniently calculated by symbolic means and are therefore suitable for the treatment of experimental time series. Such an analysis reveals the dynamics and the routing of the information flows that cause effects in the variables of a complex system. This information can be further exploited to extract the underlying direction of the physical connectivity among the variables forming the complex system, that is, a network representation of how the variables interact, which is also vital for modelling the system. Finally, I will introduce a general approach to modelling conservative physical systems in terms of their physical flows.
The lectures will be divided into three topics: (i) Doing big-data science in geophysics: filtering techniques and the estimation of the largest Lyapunov exponent applied to gravity time series; (ii) Determining the dynamics and the routing of information in complex systems: inference of connectivity and directionality of units in complex systems from time series; (iii) Flow network models and the determination of hidden flows in complex systems.
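The abstract above notes that mutual information can be calculated by symbolic means. A minimal sketch of that idea follows, assuming a simple binary encoding around a fixed threshold; the helper names `symbolize` and `mutual_information` are hypothetical, and the code computes the ordinary mutual information, not the causal mutual information used in the lectures.

```python
from collections import Counter
from math import log2


def symbolize(series, threshold):
    """Binary symbolic encoding: 1 if the value exceeds the threshold, else 0."""
    return [1 if value > threshold else 0 for value in series]


def mutual_information(a, b):
    """Mutual information in bits between two equally long symbol sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    count_ab = Counter(zip(a, b))
    # Sum over observed symbol pairs of p(x,y) * log2( p(x,y) / (p(x) p(y)) ).
    return sum(
        (c / n) * log2(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in count_ab.items()
    )


# Two toy signals encoded with an assumed threshold of 0.5.
x = symbolize([0.1, 0.9, 0.2, 0.8], threshold=0.5)
y = symbolize([0.3, 0.7, 0.4, 0.6], threshold=0.5)
print(mutual_information(x, y))
```

Applying the same function to one sequence and a time-shifted copy of the other gives a lagged mutual information, which is one common starting point for asking about directionality.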

  • Ernesto Estrada (University of Strathclyde, UK): Network analytics: Traditional vs. modern approaches.

I will introduce several problems in traditional network analysis: degree distributions, degree-degree correlations, clustering coefficients, assortativity and shortest-path communication. I will discuss the difficulties that emerge when we try to use these traditional approaches, which stem from the constraints of the available data, the lack of interpretation of the existing indices, or wrong initial assumptions about the hypotheses behind the methods. For each case I will show alternative, modern methods based on rigorous mathematical analysis using algebraic, topological and combinatorial tools. All the cases are based on real-world examples, and I will emphasize understanding the proposed methods rather than their technicalities. The students do not need any previous knowledge of network theory; only undergraduate-level mathematics is required.
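For readers new to the traditional indices named above, the local clustering coefficient can be sketched as follows. This is an illustrative example on a hypothetical four-node graph; the function names and the adjacency-dictionary representation are assumptions, not taken from the lectures.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency dictionary {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # convention: undefined cases are reported as zero
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, node) for node in adj) / len(adj)


# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 0.
adj = build_adjacency([(0, 1), (1, 2), (0, 2), (0, 3)])
for node in sorted(adj):
    print(node, local_clustering(adj, node))
print("average:", average_clustering(adj))
```

Setting the coefficient of nodes with fewer than two neighbours to zero is one common convention; other conventions exclude such nodes from the average.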

  • Jesús Gómez-Gardeñes (University of Zaragoza, Spain): Metapopulation dynamics: Linking human mobility and contagion processes

In this lecture we will combine two fields: theoretical epidemiology and mobility datasets. This will allow us to construct theoretical models capturing the back-and-forth movements (such as daily commutes) and the elementary contagion processes at work. We will introduce this framework starting from the basic compartmental models and moving to the more elaborate metapopulation ones, showing the importance of characterizing urban and regional mobility patterns to understand the spread of an epidemic. Finally, we will focus on the study of vector-borne diseases (such as Dengue, Chikungunya and Zika) in urban systems, showing the reliability of this kind of approach.

Syllabus: 

1) Compartmental models

2) Networks and epidemics

3) Metapopulation approaches & recurrent mobility patterns

5) Mobility detriments spreading

6) Applications to Vector-borne diseases
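The first syllabus item, compartmental models, can be illustrated with the classic SIR model. The sketch below integrates a single well-mixed population with a simple Euler step; it is not a metapopulation model, and the parameter values and the function name `simulate_sir` are assumptions chosen only for the example.

```python
def simulate_sir(beta, gamma, s0, i0, r0, dt, steps):
    """Euler integration of the SIR model on population fractions.

    ds/dt = -beta*s*i,  di/dt = beta*s*i - gamma*i,  dr/dt = gamma*i
    Returns the list of (s, i, r) states, including the initial one.
    """
    s, i, r = s0, i0, r0
    trajectory = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


# Illustrative parameters: infection rate 0.5, recovery rate 0.1 (per unit time).
trajectory = simulate_sir(beta=0.5, gamma=0.1, s0=0.99, i0=0.01, r0=0.0,
                          dt=0.1, steps=1000)
peak_infected = max(i for _, i, _ in trajectory)
print("peak infected fraction:", peak_infected)
print("final state:", trajectory[-1])
```

A metapopulation version would, roughly speaking, keep one such set of compartments per patch and couple the patches through the mobility data.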

  • Marta Gonzalez (MIT, U.S.A.): Computational Urban Science

We review methods to analyze human dynamics and their interactions with the built and the natural environment. We cover methods for modeling and analyzing individual daily activities and travel at both urban and global scales. Methods include principal component analysis, used to identify the structure inherent in daily behavior, and spatial clustering. We also cover models and methods to represent various socio-technical systems as networks, such as daily commuting, air travel, and roads.

  • Cristina Masoller (Universitat Politecnica de Catalunya-Terrassa, Spain): Introduction to nonlinear time series analysis tools

In these lectures I will review several approaches for the analysis of observed output signals from complex systems. First, classical linear methods of time-series analysis (Fourier analysis and linear correlations) and the underlying attractor reconstruction will be reviewed [1]. Then, methods based on complex network representations of time series will be presented (symbolic networks and the horizontal visibility graph [2]). These methods can be used for the classification of dynamical regimes and for the identification of regime transitions [3]. The predictability of extreme values (outliers) will also be discussed [4]. I will conclude by presenting the Hilbert transform, which provides, for a real oscillatory time series, x(t), an instantaneous amplitude, a(t), and an instantaneous frequency, w(t), for each data point of the time series [5]. I will discuss how a(t) and w(t) can also be used to identify regime transitions and to classify dynamical regimes. All these time-series diagnostic tools will be presented in relation to practical applications to experimentally recorded laser signals, observed atmospheric data and biomedical (ocular) images.

References

[1] H. Kantz and T. Schreiber, Nonlinear Time Series Analysis (Cambridge University Press, 2004).

[2] C. Masoller et al., “Quantifying sudden changes in dynamical systems using symbolic networks”, New Journal of Physics 17, 023068 (2015).

[3] C. Quintero-Quiroz et al., “Characterizing how complex optical signals emerge from noisy intensity fluctuations”, Sci. Rep. 6, 37510 (2016).

[4] N. Martinez Alvarez, S. Borkar and C. Masoller, “Predictability of extreme intensity pulses in optically injected semiconductor lasers”, Eur. Phys. J. Spec. Top. 226, 1971 (2017).

[5] D. A. Zappala, M. Barreiro, and C. Masoller, “Global atmospheric dynamics investigated by using Hilbert frequency analysis”, Entropy 18, 408 (2016).
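The horizontal visibility graph mentioned in the abstract above maps a time series to a network: two data points are linked when every value between them is lower than both. A minimal sketch under that standard criterion follows; the function names are hypothetical and the example series is made up.

```python
def horizontal_visibility_graph(series):
    """Return edges (i, j), i < j, where all values strictly between
    positions i and j are lower than min(series[i], series[j])."""
    edges = []
    n = len(series)
    for i in range(n - 1):
        highest_between = float("-inf")
        for j in range(i + 1, n):
            if highest_between < min(series[i], series[j]):
                edges.append((i, j))
            highest_between = max(highest_between, series[j])
            if series[j] >= series[i]:
                break  # series[j] blocks the horizontal view from i onwards
    return edges


def degree_sequence(edges, n):
    """Degree of each of the n nodes, in time order."""
    degrees = [0] * n
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    return degrees


x = [3, 1, 2, 4]
edges = horizontal_visibility_graph(x)
print(edges)
print(degree_sequence(edges, len(x)))
```

The degree distribution of the resulting graph is the kind of network quantity that can then be compared across dynamical regimes.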

  • Osvaldo Rosso (Universidade Federal de Alagoas, Brazil & CONICET, Argentina): Time series characterization by information theory based quantifiers

Objectives: This short course will give participants a broad overview of new tools based on Information Theory for characterizing time series datasets and identifying their dynamical behavior.

Syllabus:

1)      Time series and Information Theory. Chaotic dynamics as information sources.

2)      Information Theory quantifiers: Shannon entropy and Fisher Information for continuous and discrete PDFs.

3)      Information Theory quantifiers: Statistical Complexity. Simple and complex. Crystal and ideal gas. Meaning of Complexity. Statistical Complexity, C = H × Q. Disorder H. Disequilibrium Q. Maximum and minimum of Generalized Statistical Complexity. Application to logistic map.

4)      Time series & how to associate a PDF.

5)      PDF – frequency counting. Shakespeare and other English Renaissance authors.

6)      PDF – histogram and amplitudes. The logistic map.

7)      PDF – frequency (Fourier Transform) and frequency bands (Wavelet Transform) representation. EEG tonic-clonic epileptic records.

8)      PDF – ordinal patterns (Bandt-Pompe methodology). Chaos, noise and 1/f^k noise. Logistic map and white noise. Chaotic dynamics plus additive noise.

9)      The Amigó paradigm: forbidden/missing patterns.

10)  PDF – Horizontal Visibility Graph. Distinguishing chaos from noise. The lambda rule. PDF-HVG and Shannon-Fisher plane.

11)  Causal Fisher Information. Shannon-Fisher plane.

Applications: Stochastic resonance. Econophysics. Neuronal activity. Pseudo Random Generators. Electric load and vehicle behavior. Classical-quantum transition. El Niño/Southern Oscillation. Laser dynamics. Handwritten signatures.
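Syllabus item 8 associates a PDF with a time series through ordinal patterns (Bandt-Pompe methodology). The sketch below shows one way to compute the resulting normalized permutation entropy; the function names, the default order of 3 and the tie-breaking rule (earlier samples rank first) are assumptions made for the example.

```python
from collections import Counter
from math import factorial, log


def ordinal_pattern(window):
    """Indices of the window sorted by value; ties keep time order."""
    return tuple(sorted(range(len(window)), key=lambda k: window[k]))


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Shannon entropy of the ordinal-pattern distribution of `series`."""
    if order < 2 or delay < 1:
        raise ValueError("order must be >= 2 and delay >= 1")
    span = (order - 1) * delay
    n_windows = len(series) - span
    if n_windows <= 0:
        raise ValueError("series too short for this order and delay")
    counts = Counter(
        ordinal_pattern(series[i:i + span + 1:delay]) for i in range(n_windows)
    )
    entropy = -sum((c / n_windows) * log(c / n_windows) for c in counts.values())
    # Normalizing by log(order!) maps the entropy to the interval [0, 1].
    return entropy / log(factorial(order)) if normalize else entropy


# An alternating series visits both order-2 patterns equally often.
print(permutation_entropy([1, 2, 1, 2, 1], order=2))
```

The number of patterns that never occur, `factorial(order) - len(counts)` inside the function, is the quantity behind the forbidden/missing patterns of item 9.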

 

Poster

School Program

School Program: PDF version (updated on Feb. 20, 2018)

Click on the name of the lectures to watch the videos.

Extra link from Prof. Marta Gonzalez
Open this link on your phone: https://emission.eecs.berkeley.edu/#/client_setup?new_client=martanetworksp18&clear_usercache=true&clear_local_storage=true and follow the link for the second study.

Extra code from Prof. Murilo Baptista: here. Please refer to lecture 4 (Thursday, March 1)

Student presentations

Leonardo dos Reis Leano Soares, Elisabeth Mateus Yoshimura (Brazil) (PDF)
Application of the non-linear least-squares method (LSM) for non-linear parameter estimation

Federico Albanese, P. Balenzuela and V. Semeshenko (Argentina) (PDF)
The mass media bias: analysing and comparing the time series of polls and News articles during the 2016 USA presidential election

Amelia Almeida (Brazil)
Meta-regression for predicting body weight and carcass yields in beef cattle under different supplementation strategies

Gustavo Henrique Tomanik and A. S. L. O. Campanharo (Brazil) (PDF)
Predicting epileptic seizures in EEG recordings with the use of complex networks

Juliano Faria (Brazil) (PDF)
A prediction model for hospital readmission based on neural networks

David Soriano-Paños, J. Gómez-Gardeñes, A. Arenas (Spain)
Epidemic detriment driven by recurrent human mobility patterns

Vitor Hugo Louzada Patrico 
Big Science and Applications in Banks

Photos

School on NTSA and Complex Networks in the Big Data Era

Additional Information

List of participants: Updated on 27 Feb
Registration: ALL participants should register. The registration will be on February 19 at the institute at 8:30 am. You can find arrival instructions at http://www.ictp-saifr.org/?page_id=195

Accommodation: Participants whose accommodation has been provided by the institute will stay at The Universe Flat. Each of these participants has received the accommodation details individually by email.

BOARDING PASS: All participants whose travel has been provided or will be reimbursed by the institute should bring their boarding pass to registration and collect an envelope for sending the return boarding pass to the institute.

Emergency number: 9 7070 8603 (from São Paulo city); +55 11 9 7070 8603 (from abroad); 11 9 7070 8603 (from outside São Paulo).

Ground transportation instructions: 

Transportation from The Universe Flat to the institute

Transportation from Guarulhos Airport to The Universe Flat