News

Big Data Analytics course for students, entrepreneurs and business leaders at the University of Montana

M491-50: Theoretical Basics of Big Data Analytics and Near Real Time Computation Algorithms

Course Announcement (University of Montana, Dept. of Mathematical Sciences)

Contact Information: Leonid Kalachev, Professor and Chair ([email protected] ; (406) 243-4373; http://www.umt.edu/academicplanner/coursesearch/search.html )

Instructor: Professor Peter Golubtsov, Moscow State University (Russia).

3 credits; prerequisites: basic math courses at the 100 and 200 level.

Time: W 2:10PM – 5:00PM. The course will be delivered in a dual format, both face-to-face and online.

Room: Math 306.

Maximum number of students: 25.

All the necessary materials will be distributed in class (and supplied via the internet to those taking the class online). To register for the non-credit option, go to: http://umt.edu/ce/extended/noncredit/profdev/bigdata.php

Topics that are going to be addressed (tentative):

Real time learning.

Example: Correlation (or regression) analysis based on calibration.
A regression algorithm that processes a stream of data in real time is itself continuously updated by another stream of "calibration" data.

Problems:

The amount of calibration data may be huge and rapidly growing. As a result, collecting and storing all of the calibration data may require a great deal of storage.

Moreover, computing an updated version of the regression algorithm from all of these data would require a large, and constantly growing, amount of time.

Solutions:

When a calibration measurement arrives, do not store it; instead, use it to update a fixed-size summary that is sufficient for computing the regression algorithm. Such "packed" information is analogous to sufficient statistics.

Do not recompute the regression algorithm "from scratch"; update it using only the new calibration data and the stored sufficient information.

In the course: Consider the linear calibration problem and study its computational complexity using the "straightforward" and "minimal sufficient" approaches.
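The "packed information" idea can be illustrated with a minimal sketch, assuming the simplest case of fitting a straight line y ≈ a·x + b by least squares. The class name `StreamingLinearFit` and its methods are hypothetical and are not taken from the course materials. Five running sums play the role of the fixed-size summary: each calibration pair updates them in constant time and is then discarded.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class StreamingLinearFit:
    """Fixed-size summary for least-squares fitting of y ~ a*x + b.

    Hypothetical illustration: only five numbers are stored, no matter
    how many calibration pairs have been seen.
    """
    n: int = 0         # number of calibration pairs seen
    sx: float = 0.0    # sum of x
    sy: float = 0.0    # sum of y
    sxx: float = 0.0   # sum of x*x
    sxy: float = 0.0   # sum of x*y

    def update(self, x: float, y: float) -> None:
        """Fold one calibration pair into the summary; the pair is not stored."""
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def coefficients(self) -> Tuple[float, float]:
        """Return (slope, intercept) computed from the summary alone."""
        denom = self.n * self.sxx - self.sx ** 2
        if self.n < 2 or denom == 0.0:
            raise ValueError("need at least two distinct x values")
        a = (self.n * self.sxy - self.sx * self.sy) / denom
        b = (self.sy - a * self.sx) / self.n
        return a, b


fit = StreamingLinearFit()
for x, y in [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]:
    fit.update(x, y)
slope, intercept = fit.coefficients()
```

By contrast, a "straightforward" approach would keep every pair and refit from the full list each time, so both storage and refit time would grow with the number of calibration measurements.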

Real time signal processing.

Standard approaches require recording a complete signal (for example, a sound recording) and then applying a processing algorithm to it. Although such an approach can provide the most accurate processing, it cannot be done in real time, since it requires recording the full signal (or big chunks of it).

However, if time is critical, processing can be performed as the data arrive, using a "sliding window". Such an approach does not require storing the source signal and provides a feasible balance between processing quality, complexity, and delay.

In the course: Consider a simple optimal processing problem for a (potentially) infinite array of data and develop an optimal algorithm for given computational requirements and delay.
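As a minimal sketch of the sliding-window idea, the generator below smooths a stream with a moving average. The moving average is only a stand-in chosen for brevity; it is not claimed to be the optimal processing algorithm developed in the course, and the name `sliding_mean` is hypothetical. Only the last `width` samples are held in memory, and each output is delayed by `width - 1` samples relative to the start of its window.

```python
from collections import deque
from typing import Deque, Iterable, Iterator


def sliding_mean(samples: Iterable[float], width: int) -> Iterator[float]:
    """Yield the mean of the most recent `width` samples as data arrive."""
    if width < 1:
        raise ValueError("width must be at least 1")
    window: Deque[float] = deque()
    total = 0.0
    for sample in samples:
        window.append(sample)
        total += sample
        if len(window) > width:
            # Drop the oldest sample so memory use stays fixed.
            total -= window.popleft()
        if len(window) == width:
            yield total / width


smoothed = list(sliding_mean([1.0, 2.0, 3.0, 4.0, 5.0], width=3))
```

A wider window generally smooths more but increases both the delay and the memory held, which is the quality, complexity, and delay trade-off described above.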

Decomposition of a big problem into a set of similar but smaller problems that can be solved in parallel. To design an algorithm that benefits from such a decomposition, one needs either to define explicitly how the corresponding parts are computed in parallel or to use an appropriate high-level programming language (see the next section).

In the course: Consider several examples, such as vector and matrix multiplication. Compare the time requirements of "standard" and "parallel" algorithms.
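The decomposition can be sketched for a matrix-vector product: the rows are split into blocks, each block is multiplied independently, and the partial results are concatenated. This is a minimal illustration under assumptions; the function names are hypothetical, and the thread pool from Python's standard library is used only to make the explicit decomposition visible. In CPython, threads do not normally speed up pure-Python arithmetic, so a process pool or a numerical library would be needed for a real gain.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def multiply_block(rows: Matrix, vector: Vector) -> Vector:
    """'Standard' product of a block of rows with the vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in rows]


def parallel_matvec(matrix: Matrix, vector: Vector, workers: int = 4) -> Vector:
    """Split the rows into blocks and multiply the blocks independently."""
    # Ceiling division, so that at most `workers` blocks are produced.
    block_size = max(1, -(-len(matrix) // workers))
    blocks = [matrix[i:i + block_size]
              for i in range(0, len(matrix), block_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves block order, so results concatenate correctly.
        parts = pool.map(lambda rows: multiply_block(rows, vector), blocks)
        return [y for part in parts for y in part]


result = parallel_matvec([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [1.0, 1.0],
                         workers=2)
```

Each block is a smaller instance of the same problem, and no block depends on another, which is what makes the parts safe to compute in parallel.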

Thinking Big.

Contemporary algorithmic thinking is tightly linked to a programming language. At the same time, almost all algorithmic languages used for applied problems explicitly determine the order of operations. As a result, even if a computer can handle multiple tasks at once, a classical algorithmic language does not allow parallelism to be used implicitly. To overcome this, the programmer must explicitly indicate which parts of the code can be executed in parallel.
A radical approach to this problem is to move towards languages that do not specify the order of operations (for example, functional languages such as ML and Haskell).

In the course: Short introduction to functional programming. Implementation of matrix algebra and sorting algorithms in a functional language.
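To give a flavor of the functional style, here is a minimal sketch of merge sort written with pure functions. It is written in Python rather than ML or Haskell only for illustration, so Python itself will still evaluate the calls in a fixed order. The point is that nothing is mutated: the two recursive calls in `merge_sort` share no state, so a language that leaves evaluation order unspecified would be free to run them in either order or in parallel.

```python
from typing import List


def merge(left: List[int], right: List[int]) -> List[int]:
    """Merge two sorted lists into a new sorted list without mutating either."""
    if not left:
        return right
    if not right:
        return left
    if left[0] <= right[0]:
        return [left[0]] + merge(left[1:], right)
    return [right[0]] + merge(left, right[1:])


def merge_sort(items: List[int]) -> List[int]:
    """Return a sorted copy of `items`; the input list is left unchanged."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    # The two halves are sorted independently of each other.
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))


data = [5, 2, 9, 1, 5]
ordered = merge_sort(data)
```

The recursive `merge` mirrors how the function would be written in a functional language; in Python it is limited by recursion depth and is suitable only for short lists.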

Real time processing challenges.

Example: the information system of a modern aircraft.

Thousands of sensors continuously provide streams of readings.

In real time, the system is required to:
Recognize "pathological" patterns in combinations of readings as they evolve over time;

Identify the malfunctioning device or circuit;

Initiate appropriate compensation mechanisms.
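The first requirement can be illustrated, in a much simplified form, for a single sensor. The sketch below is an assumption made for illustration and not a method described in the announcement: it keeps a running mean and variance (Welford's update) in fixed memory and flags a reading that lies more than `threshold` standard deviations from the mean of earlier readings. The class name `SensorMonitor` and its parameters are hypothetical; recognizing patterns across combinations of sensors would need considerably more than this.

```python
import math


class SensorMonitor:
    """Flag unusual readings from one sensor using fixed-size running statistics."""

    def __init__(self, threshold: float = 3.0, warmup: int = 10) -> None:
        self.threshold = threshold      # allowed deviation, in standard deviations
        self.warmup = max(warmup, 2)    # readings to see before flagging anything
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0                   # running sum of squared deviations

    def observe(self, value: float) -> bool:
        """Return True if `value` looks anomalous against earlier readings."""
        anomalous = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            anomalous = std > 0 and abs(value - self.mean) > self.threshold * std
        # Welford's update; note that flagged readings are also folded in.
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)
        return anomalous


monitor = SensorMonitor(threshold=3.0, warmup=5)
flags = [monitor.observe(v) for v in [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 50.0]]
```

Because each reading is processed in constant time and then discarded, one such monitor per sensor keeps pace with the stream without storing its history.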
