# Multivariate Statistical Analysis

**DEGREE PROGRAMME: Master's Degree (Laurea Magistrale) in Economics-Scienze Economiche, academic year 2015/2016**

## Course Title

Multivariate Statistical Analysis

Language of instruction: English

Credits and lecture hours: 6 CFU

Modules: No

Scientific disciplinary sector(s): SECS-S/01

Instructor: Laura Pagani

Email address: laura.pagani@uniud.it

Personal web page: http://people.uniud.it/page/laura.pagani

## Requirements

PREREQUISITES: One semester of basic statistics, or an equivalent background.

REQUIRED PRECEDING COURSES: -

## Knowledge and skills

Multivariate statistical techniques are important tools of analysis in fields such as economics, sociology, finance, production, accounting and marketing.

The goal of this course is to develop skills with a range of procedures and programs for multivariate data analysis, with an overview of actual applications in economics and/or sociology.

Subject-specific skills

The course addresses both the theory underlying each technique and the problems that arise in applying it. A reasonable level of competence in both statistics and mathematics is therefore needed.

At the end of the course a student will be able to:

• make sense of the math behind many multivariate statistical analyses

• understand the ideas behind the methods and reproduce the proofs discussed in class

• identify appropriate multivariate methods for a given research question

• prepare data for analysis: import and export data sets, and explore the data in R to detect missing values, outliers and other data problems

• apply the selected multivariate techniques using R

• interpret the results of the analysis

• present results in an effective way.

Soft skills

• the student will be able to choose the appropriate method for their analysis given the structure of the data set

• the student will refine their written and oral communication skills

• develop problem solving skills in a team environment

• engage in active, student-directed learning in preparation for professional life

## Course description

The course is divided into lectures and labs. Lectures cover the material in the assigned readings, including a review and discussion of the most relevant material with illustrated examples. The main purpose of the lectures is to provide students with the theoretical and conceptual background needed to complete the lab projects. Each set of lectures on a specific multivariate technique is followed by a lab section. The purpose of the lab section is to give students hands-on experience analyzing real data sets (either their own or one provided by the instructor) using the R programming language and statistical computing environment.

The topics are:

• Vector and matrix algebra useful for multivariate analysis

• Introduction to multivariate exploratory data analysis (EDA)

• Principal component analysis

• Exploratory factor analysis

• Analysis of categorical data

• Simple correspondence analysis

• Distances, dissimilarity, classification and clustering

## Teaching and Learning activities

The instructor will try to use an interactive format, and encourage everyone to actively participate and to ask questions when something is unclear.

Lectures will both explain the statistical methods and illustrate them with examples based on observed data sets.

Most of the materials (slides and data sets) will be posted on the course web page (Materiale didattico).

## Bibliography

Johnson, R.A. and Wichern, D.W. (2007) *Applied Multivariate Statistical Analysis*, Sixth Edition. Prentice-Hall International Editions.

Zani, S. and Cerioli, A. (2007) *Analisi dei dati e data mining per le decisioni aziendali*. Milano: Giuffrè.

Everitt, B. and Hothorn, T. (2011) *An Introduction to Applied Multivariate Analysis with R*. Springer.

## Examination

The exam will be based on all material discussed in class. This course uses project-based and student-directed learning as the primary means of student evaluation. Consequently, grading is based largely on student participation and performance in student-directed projects. Grading is based on three items:

• 20% homework. A few homework sets will be suggested. Students may work in groups but each student should hand in their own individual problem set.

• 40% project. Projects are applied: each one analyzes a multivariate data set in R using techniques presented in the course. Students may work in groups (strongly recommended). The instructor will suggest project topics, and groups or individual students may also propose their own. Each project requires a report of about 15-20 pages (introduction, objective of the analysis, data set description, definition of the methods, statistical analysis and conclusions) and a presentation, whose schedule will be announced in due time.

• 40% final exam. The exam will include four theoretical questions.

## Further readings and support material

Some material (slides, data sets, R scripts) will be posted on the course web page (Materiale didattico). Some R material (freely downloadable reports, blogs, etc.) will be suggested.

## Thesis

Dissertations in multivariate statistical analysis may be theoretical, but most are applications of methods introduced in the course, applications of new methods, or case studies.

## Remarks

The topics introduced in Multivariate Statistical Analysis are very general and can also be applied to topics addressed in other courses of this degree and of other degrees.

Students who do not attend the class must contact the instructor for information about the exam format (the program is the same).