Last Updated | S022020 |

### DEng 801

Unit Name | Advanced Data Analysis |

Unit Code | DENG801 |

Unit Duration | 24 weeks |

Award | Doctor of Engineering (Duration: 3 years) |

Year Level | Two |

Unit Creator / Reviewer | Dr Tony Auditore |

Core/Sub-Discipline: | Core |

Pre/Co-requisites | N/A |

Credit Points | 8 (Total Program Credit Points: 120) |

Mode of Delivery | Online or on-campus. |

Unit Workload | 20 hours per fortnight: Lecture - 1 hour; Tutorial - 1 hour; Assessments / Practical / Lab - 2 hours (where applicable); Personal Study recommended - 16 hours (guided and unguided) |

## Unit Description and General Aims

Research projects consist of collecting, analysing and making inferences from data. This unit imparts to candidates the procedural skills for gathering, organising, analysing and presenting quantitative data. It applies a scientific approach to converting data into information and to analysing numerical data, enabling candidates to maximise their interpretation, understanding and application of data in their assigned research project.

At the conclusion of this unit students should be able to: develop research questions and link them to study designs; understand differences between quantitative and qualitative research and their application; be familiar with different methods for collecting and analysing qualitative data; be familiar with different methods for collecting quantitative data and basic concepts of probability sampling; understand simple descriptive analyses for quantitative data; interpret multiple sources of data; and develop evidence-based conclusions and recommendations.

Case studies and practical exercises are employed in this unit to assist students to convert data into information; that is, data that have been interpreted, understood and are useful to the student in their final research project.

## Learning Outcomes

On successful completion of this Unit, students are expected to be able to:

- Critique the procedural steps and skills for gathering, organising, analysing and presenting quantitative and qualitative data.

*Bloom’s Level 5*

- Evaluate the relation between research questions and study methodologies.

*Bloom’s Level 5*

- Integrate descriptive analyses for quantitative data.

*Bloom’s Level 6*

- Propose multiple sources of data and develop evidence-based conclusions and recommendations.

*Bloom’s Level 6*

**Bloom’s Taxonomy**

The cognitive domain levels of Bloom’s Taxonomy:

Bloom’s level | Bloom’s category | Description |

1 | Remember | Retrieve relevant knowledge from long-term memory by recognising, identifying, recalling and retrieving. |

2 | Understand | Construct meaning from instructional messages by interpreting, classifying, summarising, inferring, comparing, contrasting, mapping and explaining. |

3 | Apply | Carry out or use a procedure in a given situation by executing, implementing, operating, developing, illustrating, practising and demonstrating. |

4 | Analyse | Deconstruct material and determine how the parts relate to one another and to an overall structure or purpose by differentiating, organising and attributing. |

5 | Evaluate | Make judgments based on criteria and standards by checking, coordinating, evaluating, recommending, validating, testing, critiquing and judging. |

6 | Create | Put elements together to form a coherent pattern or functional whole by generating, hypothesising, designing, planning, producing and constructing. |

## Student assessment

Assessment Type | When assessed (e.g. after Topic 5) | Weighting (% of total unit marks) | Learning Outcomes Assessed |

Type: Multi-Choice Test. Word length: n/a. Questions from the content covered over the first weeks of instruction, including: qualitative and quantitative data; Bayes’ Theorem; discrete random variables and probability distributions; covariance and correlation; bivariate normal distribution. | After Topic 5 | 5% | 1, 3 |

Type: Data analysis application / Self-assessment. Word length: n/a. A ‘self-originated’ question by the student, to be answered within the boundaries, scope and study reference material provided by the examiner. Covers sampling, distributions and confidence intervals. | After Topic 9 | 35% | 2, 3 |

Type: Statistical Report / Case study. Word length: n/a. Application and report on how statistics can be used to solve real-world engineering research problems. | Final Week | 60% | 1-5 |

## Prescribed and Recommended Readings

Required Textbook(s)

The required textbook provides important references in each chapter that are relevant to the subject matter. These references, and those provided by the instructor, will form the basis of the study material. The textbook serves as a study guide and as a future reference for statistical theory, research methods, calculations and visuals.

- Montgomery, Douglas C. and Runger, George C. (2013) *The Role of Statistics in Engineering.* John Wiley & Sons, Inc., New York, USA. ISBN: 9781118539712 / 1118539710

Reference Materials

In addition to the above textbook, several useful statistical tools are provided in an Engineering Statistical Handbook. This e-book is freely available from the National Institute of Standards and Technology (U.S. Department of Commerce) website: http://www.itl.nist.gov/div898/handbook/. Further subject-related reference material may be obtained online from published journals and websites. These resources may be obtainable from www.academia.edu.

Software Reference Material

Software can be used to process data and to present computed results professionally. Numerous software packages are available. For convenience and affordability, the Microsoft Excel add-on XLSTAT-Base is proposed.

- The proposed XLSTAT-Base software supports data mining, machine learning, statistical tests, data modelling and visualization. It can be applied to data preparation and visualization, parametric and nonparametric tests, modelling methods (ANOVA, regression, generalized linear models, mixed models, nonlinear models), data mining features (principal component analysis, correspondence analysis) and clustering methods (Agglomerative Hierarchical Clustering, K-means). XLSTAT-Base also features machine learning methods (association rules, regression and classification trees, and K-Nearest Neighbours), partial least squares regression and more. It is IET’s viewpoint that XLSTAT-Base will be a comprehensive and affordable research tool for the candidate’s final research project. Further reading is available at https://www.xlstat.com/en/solutions/base.
- Alternative software, such as Maple, Quantum XL or MATLAB, may be used.

## Unit Content

#### Topic 1

*Students will review the techniques for collecting data and how to apply them. Content will include:*

- Overview of data analysis.
- Different methods for collecting and analysing qualitative data.
- Different methods for collecting quantitative data.

#### Topics 2 and 3

*Students will be introduced to data collection through an overview of the techniques for collecting data, how to apply them, where to find data of any type, and how to keep records. These topics also introduce the field of statistics and how researchers and engineers use statistical methodology as part of the research and engineering problem-solving process. Content will include:*

- The role of statistics in engineering research.
- The engineering method and statistical thinking.
- Probability – Bayes’ Theorem.
- Discrete random variables and probability distributions – includes discrete uniform, binomial, geometric and negative binomial, hypergeometric and Poisson distribution.
- Continuous random variables and probability distributions.

Of the above content it will be mandatory for the candidates to demonstrate their mastery of the appropriate theory and mathematics in the following topics:

- Probability – Bayes’ Theorem.
- Discrete random variables and probability distributions.
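
As an illustration only, the kind of calculation expected for Bayes’ Theorem can be sketched in Python. The function name `posterior` and the defect-screening figures are hypothetical and are not taken from the prescribed text; the sketch assumes a single binary event A and a single binary observation B.

```python
def posterior(prior: float, likelihood: float, false_positive_rate: float) -> float:
    """Return P(A | B) using Bayes' Theorem.

    prior               = P(A)
    likelihood          = P(B | A)
    false_positive_rate = P(B | not A)
    """
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A).
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical figures: 2% of components are defective; an inspection
# flags 95% of defective components and 10% of good components.
p_defective_given_flag = posterior(prior=0.02, likelihood=0.95,
                                   false_positive_rate=0.10)
print(round(p_defective_given_flag, 4))  # prints 0.1624
```

Even with a sensitive inspection, a flagged component in this hypothetical case is defective only about 16% of the time, because defects are rare to begin with.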

#### Topics 4 and 5

*Students will be introduced to the concepts of joint probability distributions, and independence.*

- Two discrete random variables.
- Multiple discrete random variables.
- Two continuous random variables.
- Covariance and correlation.
- Bivariate normal distribution.
- Linear combinations of random variables.

Of the above content it will be mandatory for the candidates to demonstrate their mastery of the appropriate theory and mathematics in the following topics:

- Covariance and correlation.
- Bivariate normal distribution.
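
The sample versions of covariance and correlation can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names and the load/deflection readings are hypothetical, and the unbiased (n - 1) divisor is an assumption about the convention in use.

```python
import math


def sample_covariance(x, y):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def sample_correlation(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return sample_covariance(x, y) / math.sqrt(
        sample_covariance(x, x) * sample_covariance(y, y)
    )


# Hypothetical paired readings: applied load and measured deflection.
load = [1.0, 2.0, 3.0, 4.0, 5.0]
deflection = [2.1, 3.9, 6.2, 7.8, 10.1]
print(sample_covariance(load, deflection))
print(sample_correlation(load, deflection))
```

Covariance carries the units of both variables, whereas correlation is dimensionless and lies between -1 and 1, which is why the two are usually reported together.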

#### Topics 6 and 7

*Students will be introduced to the treatment of statistical methods; important properties of estimators; the methods of maximum likelihood and moments; and the central limit theorem. Content will include:*

- Treatment of statistical methods with random sampling.
- Data summary and description techniques, including stem-and-leaf plots, histograms, box plots, probability plots and several types of time series plots.
- Point estimation of parameters.
- Important properties of estimators.
- Method of maximum likelihood.
- Method of moments.
- Sampling distributions and the central limit theorem.

Of the above content it will be mandatory for the candidates to demonstrate their mastery of the appropriate theory and mathematics in the following topics:

- Point estimation of parameters.
- Method of maximum likelihood.
- Sampling distributions and the central limit theorem.
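
The behaviour of a sampling distribution can be explored with a short simulation. The sketch below is illustrative only: the exponential population, the rate, the sample size and the number of replicates are all assumed values, and `sample_means` is a hypothetical helper. Under the central limit theorem, the means of samples of size n should centre on the population mean 1/rate with a standard deviation near (1/rate)/sqrt(n).

```python
import math
import random
import statistics


def sample_means(rate, n, replicates, seed=0):
    """Draw `replicates` samples of size n from Exp(rate); return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean([rng.expovariate(rate) for _ in range(n)])
        for _ in range(replicates)
    ]


rate, n = 2.0, 30
means = sample_means(rate, n, replicates=5000)

# Compare the simulated sampling distribution with the theoretical values.
print("mean of sample means:", statistics.fmean(means), "theory:", 1 / rate)
print("sd of sample means:  ", statistics.stdev(means),
      "theory:", (1 / rate) / math.sqrt(n))
```

For an exponential population the reciprocal of the sample mean is also the maximum likelihood estimate of the rate, so the same simulated samples can be reused to examine the behaviour of that estimator.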

#### Topics 8 and 9

*Students will be introduced to statistical intervals for single and two samples; and the methods for determining appropriate sample sizes. Content will include:*

- Interval estimation for a single sample.
- Confidence intervals for means, variances or standard deviations, and proportions; prediction and tolerance intervals.
- Hypothesis tests for a single sample.
- Tests and confidence intervals for two samples.
- Detailed information and examples of methods for determining appropriate sample sizes.
- Techniques applied to solve real-world engineering problems and understanding of these concepts.
- Logical, heuristic development of procedures.

Of the above content it will be mandatory for the candidates to demonstrate their mastery of the appropriate theory and mathematics in the following topics:

- Tests and confidence intervals for two samples.
- Techniques applied to solve real-world engineering problems and understanding of these concepts.
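
A confidence interval for a mean, and the matching sample-size calculation, can be sketched as below. This is a simplified illustration that assumes a normally distributed population with known standard deviation (the z-interval case); the function names and measurement values are hypothetical. When the standard deviation must be estimated from the sample, a t-based interval would be used instead.

```python
import math
from statistics import NormalDist, mean


def z_confidence_interval(sample, sigma, confidence=0.95):
    """Two-sided interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # e.g. about 1.96 for 95%
    half_width = z * sigma / math.sqrt(len(sample))
    centre = mean(sample)
    return centre - half_width, centre + half_width


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n whose interval half-width does not exceed `margin`."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * sigma / margin) ** 2)


# Hypothetical measurements with an assumed known sigma of 2.0.
measurements = [10.0, 12.0, 11.0, 13.0, 9.0]
print(z_confidence_interval(measurements, sigma=2.0))
print(required_sample_size(sigma=2.0, margin=0.5))  # prints 62
```

Halving the acceptable margin roughly quadruples the required sample size, because the margin enters the formula squared.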

#### Topics 10 and 11

*Students will be introduced to design and experiments with several factors. Content will include:*

- Two-factor factorial experiments.
- Factorial designs.
- Blocking and confounding in design.
- Fractional replication of the design.
- Response surface methods and designs.

Of the above content it will be mandatory for the candidates to demonstrate their mastery of the appropriate theory and mathematics in the following topics:

- Factorial designs.
- Fractional replication of the design.
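
For a two-level, two-factor factorial design, the main effects and the interaction can be estimated from the four cell means. The sketch below is illustrative only: the function name, the coded levels (-1 for low, +1 for high) and the response values are hypothetical, and each cell is assumed to hold a single mean response.

```python
def factorial_2x2_effects(y):
    """Estimate effects in a 2x2 factorial design.

    `y` maps (a_level, b_level), each coded -1 or +1, to the mean response.
    Each effect is the average response at the high level of the contrast
    minus the average response at its low level.
    """
    effect_a = ((y[(1, -1)] + y[(1, 1)]) - (y[(-1, -1)] + y[(-1, 1)])) / 2
    effect_b = ((y[(-1, 1)] + y[(1, 1)]) - (y[(-1, -1)] + y[(1, -1)])) / 2
    effect_ab = ((y[(1, 1)] + y[(-1, -1)]) - (y[(1, -1)] + y[(-1, 1)])) / 2
    return {"A": effect_a, "B": effect_b, "AB": effect_ab}


# Hypothetical cell means for factors A and B.
responses = {(-1, -1): 20.0, (1, -1): 30.0, (-1, 1): 40.0, (1, 1): 52.0}
print(factorial_2x2_effects(responses))
# prints {'A': 11.0, 'B': 21.0, 'AB': 1.0}
```

In this hypothetical data set the interaction effect is small relative to the main effects, which suggests the two factors act almost additively.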

#### Topic 12

*Students will be introduced to how statistics can be used to solve real-world engineering research problems. In this regard, examples of the application of probability theory in risk assessment for safety criteria will be provided. An opportunity will be provided to review all work and to clarify outstanding issues. Instructors/facilitators may choose to focus on a specific area(s) of the unit.*

## Software/Hardware Used

#### Software

- Software: N/A
- Version: N/A
- Instructions: N/A
- Additional resources or files: N/A

#### Hardware

- N/A