Master's Degree in Computer Science with a specialization in Data Science
Ukrainian Catholic University, Lviv
Prerequisites
● Knowledge of calculus (functions, differentiation, integration, series) and the basics of linear algebra (vectors, matrices, systems of linear equations)
● Satisfactory knowledge of C/C++, Java, Python, or C#, and of object-oriented programming
● Basic data structures: arrays, trees, lists, stacks, queues
● Basic knowledge of relational databases, SQL
● Discrete math: sets, relations, Boolean algebra, graphs, basic graph algorithms
● Good to know: statistical background (distributions, Bayes' theorem), basic proficiency in R or Matlab (Octave)
Program duration
● 15 months
● 3 semesters with 7 study sessions
● Study session – 3 days (Thu, Fri, Sat) every other week
● Study day – 10 study hours
Two streams
Graduation skills
● Computer Science – A graduate should meet Google's interview requirements
● Data Science – The program was built on the basis of the “Data Science Metro Map”
Computer Science Graduate Skills
● Coding: C++ or Java, C and Python, “...Object Orientated Design and Programming, how to test your code...”
● Algorithms: bottom-up and top-down algorithms, Sorting (plus searching and binary search), Divide-and-Conquer, Dynamic Programming / Memoization, Greediness, Recursion or algorithms linked to a specific data structure, A*, Dijkstra
● Data structures: Arrays, Linked Lists, Stacks, Queues, Hash-sets, Hash-maps, Hash-tables, Dictionaries, Trees and Binary Trees, Heaps, and Graphs
● Mathematics
● Graphs: algorithms for distance, search, connectivity, and cycle detection; the basic graph traversal algorithms (breadth-first search and depth-first search)
● Operating systems: processes, threads, concurrency issues, locks, mutexes, semaphores, monitors
● System design: feature sets, interfaces, class hierarchies, distributed systems, designing a system under certain constraints, simplicity, limitations, robustness and tradeoffs
Data Science Graduate Skills
● http://nirvacana.com/thoughts/becoming-a-data-scientist/
– Fundamentals
– Statistics
– Programming
– Machine Learning
– Text Mining / Natural Language Processing
– Data Visualization
– Big Data
– Data Ingestion
– Data Munging
– Toolbox
Curriculum (with timeline)
Computer Science Courses
Computer Science 1
● Advanced Programming, 16 lectures
– Messaging concept, Method lookup & dispatch, Principle of Least Knowledge
– Type systems, Design patterns, Language design
– Testing, Software quality, Refactoring
● Algorithms and Data Structures, 16 lectures
– Algorithm complexity theory, sorting algorithms (quicksort, mergesort, heapsort)
– Union-find algorithm
– Priority queues, binary search trees, red-black trees, hash tables
– Graph-processing algorithms (minimum spanning tree, shortest-path algorithms)
– Greedy algorithms, dynamic programming
● Advanced Database Systems, 12 lectures
– New data types (unstructured, textual), Parallel Databases
– NoSQL, MongoDB, Spark, Streaming Systems
– In-memory data management, Temporal and spatial databases
– Distributed databases, Heterogeneous databases and data integration
– MapReduce, Hadoop, HBase, Hive, Association Rules
Computer Science 2
● Parallel Computing, 16 lectures
– Implicit vs. explicit parallelism
– Shared vs. non-shared memory (locks, race conditions, deadlock)
– Synchronization mechanisms, parallel programming models
– Communications and interconnection networks, multicore caching and memory systems
– Messaging, multicore processor design
– Functional programming
● Advanced Algorithms, 12 lectures
– Distributed algorithms: matrix factorization, large graph analysis
– Streaming and online algorithms
– Optimization algorithms: search states, metaheuristics, genetic algorithms
– Simulated annealing, tabu search, Monte Carlo
● Software Architecture, 12 lectures
– The architecture influence cycle, quality attributes, architecture design using patterns and tactics, documenting and evaluating software architecture, architecture reuse, architecture in Agile projects
● Software Optimization, 8 lectures
– Basic compiler optimizations, Data-flow analysis, Optimization
– Scheduling, Dynamic compilation, Pointer alias analysis, Parallelism/Locality
Product Development
● Product Life Cycle / Product Management / System Analysis and Design, 12 lectures
● Managing Innovations / Entrepreneurship / Startup Strategies, 8 lectures
● Law in IT, 8 lectures
– Trade marks and international trade, Patents, Copyright law
– Various types of licenses
– Introduction to cyberspace and cyberlaw, IP Protection for software
– Copyright in cyberspace, Content Liability
– Trade marks, the Internet & domain names
– Cybercrime, Online privacy
Data Science Courses
Mathematical Foundations
● Introduction to Data Science, 4 lectures
– General introduction to the Data Science problem domain and topics: what machine learning is, the learning problem, supervised and unsupervised learning, regression, generalization and overfitting, introduction to time series
● Linear algebra, 8 lectures
– Algorithms for eigenvalue and eigenvector computations
– Efficiency and stability of algorithms
– Matrix factorizations
– Solving linear systems and least squares problems
● Numerical optimization, 8 lectures
– Unconstrained optimization: optimality conditions; methods: steepest descent, conjugate gradient, quasi-Newton
– Linear optimization: solving LPs graphically, simplex method, sensitivity
– Mixed-integer linear programming: branch-and-bound
– Elements of constrained optimization
● Applied Statistics and Probabilistic Analysis, 16 lectures
– Statistical inference, decision theory, point and interval estimation, hypothesis testing, ANOVA
– Neyman-Pearson theory, maximum likelihood
– Bayesian analysis, large sample theory
– Simple linear regression, Multiple regression, Polynomial Regression
– Analysis of Variance: Fixed Effects, Nonlinear Regression, Generalized Linear Models
– Time Series Regression: Correlated Errors
Data Science 1
● Machine Learning, 20 lectures
– The Learning Problem, supervised vs. unsupervised learning
– Feasibility, Training vs. Testing
– Theory of Generalization, overfitting, validation
– Linear models, linear regression, logistic regression
– Neural networks, support vector machines, kernel methods
– Clustering, Bayesian and regularized regression, Naive Bayes Classifier
● Getting and Cleaning Data, 12 lectures
– Acquisition and cleaning of multisource data sets, types of data sources and databases, web scraping and APIs, text parsing and regular expressions
– Dimensionality reduction, normalization, feature extraction, denoising, sampling, principal component analysis, feature selection
● Data Visualization, 8 lectures
– Visualization Infrastructure (graphics programming and human perception)
– Multidimensional Data Visualization
– Basic Visualization: charts, graphs, animation, interactivity, hierarchies, networks
– Visualization toolkits: ggplot2, d3.js, Tableau
– Exploratory data analysis, Visual analytics
Data Science 2
● Data Science Problems, 4 lectures
– Brief introduction to the different data science domains
● Introduction to Deep Learning, 8 lectures
– Introduction to the main concepts of the Deep Learning paradigm
– Description of the general approaches in DL
● Mining Massive Datasets, 16 lectures
– Introduction to Big Data
– Large-scale supervised machine learning
– Link Analysis, PageRank, Distance Measures, Nearest Neighbors
– Mining data streams, Analysis of Large Graphs, Clustering, MapReduce Algorithms
● Application courses, 2 courses x 16 lectures (pick any two from the list)
– DS Applications in Business Intelligence and Finance
– Computer Vision
– Natural Language Processing
– Bioinformatics
– Recommendation systems
– DS Applications in Medicine
– Network Analysis
– DS for Smart Cities (Energy, Transportation, etc.)
– Reinforcement Learning
– …...
Self Development Module
Soft skills
● One lecture per session: “Meet the leader”
● Reflexio program
Contacts
For more information:
Oleksii Molchanovskyi
Academic Program Manager at CS@UCU