Presentation

Teaching Organization

DM270
DATA SCIENCE ORD. 2017


CFU: 12

Common courses

 

Activity | Lectures | Exercises | Laboratory | Individual study
Hours | 96 | 0 | 0 | 102

Period

Year | Period
1st year | 1st semester

Attendance

Optional

Delivery

Conventional

Language

English

Teaching Activities Calendar

Start | End
02/10/2017 | 19/01/2018

Type

Type | Area | SSD | CFU
Characterizing | Computer science technologies | INF/01 | 6
Characterizing | Computer science technologies | ING-INF/05 | 6


Course Coordinator

Coordinator | SSD | Department
Dr. TOLOMEI GABRIELE | INF/01 | Department of Mathematics

Other Instructors

Instructor | Appointment | SSD | Department
Dr. BUJARI ARMIR | Institutional | INF/01 | Department of Mathematics
Dr. PINELLI FABIO | Contract | N/A

Teaching Support Activities

Teaching assistant
Dr. DI BONO MARIA GRAZIA

Bulletin

The student should have basic knowledge of programming and algorithms.

This class teaches the concepts, methods, and technologies underlying the storage, networking, and processing of data and big data. For storage, the course introduces the basics of relational databases, then reviews the non-relational solutions typically adopted for big data, and also presents the basics of systems for storing data streams. The networking part introduces fundamental concepts in the design and implementation of computer communication networks, their protocols, and their applications. Topics covered in this part include layered network architecture, data link protocols, network and transport protocols, and applications, with examples drawn from the Internet TCP/IP protocol suite. The course then introduces advanced and emerging networking paradigms that address QoS and the engineering flexibility of current infrastructure networks, ranging from software-defined networking to cloud provisioning schemes and datacenters. The programming part focuses on programming for data scientists using Python, starting from a description of its interactive computational environment and continuing with storage, data manipulation, and visualization.
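As a small illustration of the relational storage topic mentioned above, the following is a minimal sketch using Python's standard-library `sqlite3` module. The table `exam`, its columns, and the sample rows are hypothetical and are not taken from the course material; the sketch only shows a relational table being queried with SQL from Python.

```python
import sqlite3


def average_grade_per_course(rows):
    """Load (student, course, grade) rows into an in-memory table and
    return {course: average grade}, computed with a SQL GROUP BY."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema, used only for this illustration.
        conn.execute(
            "CREATE TABLE exam (student TEXT, course TEXT, grade INTEGER)"
        )
        # Parameterized inserts keep data separate from the SQL text.
        conn.executemany("INSERT INTO exam VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT course, AVG(grade) FROM exam GROUP BY course ORDER BY course"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


# Hypothetical sample data.
sample_rows = [
    ("alice", "databases", 28),
    ("bob", "databases", 30),
    ("alice", "networking", 26),
]
averages = average_grade_per_course(sample_rows)
print(averages)
```

The grouping and averaging are expressed declaratively in SQL, while Python only moves rows in and results out.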

The course consists of lectures.

The course will cover the topics listed below:

- Databases
  Introduction to relational databases: data model; relational algebra; SQL; DBMS.
  NoSQL technologies: characteristics of NoSQL databases; aggregate data models: key-value stores, document databases, column-family stores, graph databases, others; distribution models: sharding, replication (master-slave, peer-to-peer).
  Streams of data: architecture(s); data modeling; query processing and optimization.
- Networking
  Networking fundamentals: network architectures (OSI model); TCP and UDP transport-layer protocols; IP addressing and routing; link-layer forwarding; DNS and DHCP.
  Advanced networking: Virtual LAN (VLAN) and Virtual eXtensible LAN (VXLAN); Software Defined Networking: control and data planes, virtualization; concepts of cloud computing: service and deployment models; data center architectures, topologies, addressing, routing, traffic characteristics.
  Case study: the Web of Things (IoT standards and protocols).
- Programming
  Programming for data scientists using Python: computational environment (IPython and Jupyter); storage and manipulation (NumPy and Pandas); data visualization (Matplotlib).
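The distribution models named in the database topics (sharding and replication) can be sketched in a few lines of Python. This is only one simple placement scheme, assumed here for illustration and not necessarily the one presented in the lectures: a key is hashed to a primary shard, and its copies go to the following shards in ring order. The function names `shard_for` and `replicas_for` and the key `"user:42"` are hypothetical.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard index using a stable hash (SHA-256)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce modulo
    # the number of shards.
    return int.from_bytes(digest[:8], "big") % num_shards


def replicas_for(key, num_shards, replication_factor):
    """Return the primary shard followed by the next shards in ring order."""
    if not 1 <= replication_factor <= num_shards:
        raise ValueError("replication_factor must be between 1 and num_shards")
    primary = shard_for(key, num_shards)
    return [(primary + i) % num_shards for i in range(replication_factor)]


# Hypothetical key: find which 2 of 4 shards would hold its copies.
placement = replicas_for("user:42", num_shards=4, replication_factor=2)
print(placement)
```

A stable hash such as SHA-256 is used instead of Python's built-in `hash()` because the latter is randomized per process for strings, which would make placement differ between runs.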

The student is expected to pass a written and an oral exam.

The written and the oral exams will be evaluated on the basis of the following criteria: i) student’s knowledge of the concepts, methods, and technologies at the basis of the topics covered in the course; ii) student’s capacity for synthesis, clarity, and abstraction.

CONTENT NOT AVAILABLE

Slides presented during the lectures are made available to students as reference material.