Organisations use their data for decision support and to build data-intensive products and services. The collection of skills required by organisations to support these functions has been grouped under the term Data Science. This subject will articulate the expected output of data scientists and then equip students with the ability to deliver against these expectations. A particular focus will be given to the tools required to model, store, clean, manipulate, and ultimately extract information out of stored data.
|Academic unit:||Bond Business School|
|Subject title:||Data Science|
Delivery & attendance
|Attendance and learning activities:||Attendance at all class sessions is expected. Students are expected to notify the instructor of any absences with as much advance notice as possible.|
|Prescribed resources:||No prescribed resources. After enrolment, students can check the Books and Tools area in iLearn for the full Resource List.|
|iLearn@Bond & Email:||iLearn@Bond is the online learning environment at Bond University and is used to provide access to subject materials, lecture recordings and detailed subject information regarding the subject curriculum, assessment and timing. Both iLearn and the Student Email facility are used to provide important subject notifications. Additionally, official correspondence from the University will be forwarded to students’ Bond email account and must be monitored by the student.|
To access these services, log on to the Student Portal from the Bond University website at www.bond.edu.au.
Assurance of learning
Assurance of Learning means that universities take responsibility for creating, monitoring and updating curriculum, teaching and assessment so that students graduate with the knowledge, skills and attributes they need for employability and/or further study.
At Bond University, we carefully develop subject and program outcomes to ensure that student learning in each subject contributes to the whole student experience. Students are encouraged to carefully read and consider subject and program outcomes as combined elements.
Program Learning Outcomes (PLOs)
Program Learning Outcomes provide a broad and measurable set of standards that incorporate a range of knowledge and skills that will be achieved on completion of the program. If you are undertaking this subject as part of a degree program, you should refer to the relevant degree program outcomes and graduate attributes as they relate to this subject.
Subject Learning Outcomes (SLOs)
On successful completion of this subject the learner will be able to:
- Demonstrate proficiency in statistical learning techniques using the R programming language.
- Demonstrate ability to access data from databases using SQL, APIs and traditional formats.
- Build data-driven statistical models to address big data focused business problems.
- Apply data visualisation techniques to communicate solutions to management.
- Communicate technical solutions to non-technical stakeholders in a professional, concise written report.
|Type||Task||Weighting||Timing*||Outcomes assessed|
|Skills Assignment||Decision Tree Assignment: Students are required to apply the skills learned in lectures and workshops to data to solve a real-world problem.||10%||Week 7||1, 2, 3, 4.|
|Skills Assignment||Logistic Regression Assignment: Students are required to apply the skills learned in lectures and workshops to data to solve a real-world problem.||10%||Week 8||1, 2, 3, 4.|
|Skills Assignment||Nearest-Neighbours Assignment: Students are required to apply the skills learned in lectures and workshops to data to solve a real-world problem.||10%||Week 10||1, 2, 3, 4.|
|Skills Assignment||Clustering Assignment: Students are required to apply the skills learned in lectures and workshops to data to solve a real-world problem.||10%||Week 11||1, 2, 3, 4.|
|Skills Assignment||Project: Students are required to create a Shiny visualisation dashboard using the functions created in the four previous assignments.||20%||Week 13||1, 2, 3, 4, 5.|
|Paper-based Examination (Closed)||Mid-semester examination addressing theoretical content to date.||40%||Week 7 (Mid-Semester Examination Period)||1, 2, 3.|
- * Assessment timing is indicative of the week that the assessment is due or begins (where conducted over multiple weeks), and is based on the standard University academic calendar
- C = Students must reach a level of competency to successfully complete this assessment.
|High Distinction||85-100||Outstanding or exemplary performance in the following areas: interpretative ability; intellectual initiative in response to questions; mastery of the skills required by the subject, general levels of knowledge and analytic ability or clear thinking.|
|Distinction||75-84||Usually awarded to students whose performance goes well beyond the minimum requirements set for tasks required in assessment, and who perform well in most of the above areas.|
|Credit||65-74||Usually awarded to students whose performance is considered to go beyond the minimum requirements for work set for assessment. Assessable work is typically characterised by a strong performance in some of the capacities listed above.|
|Pass||50-64||Usually awarded to students whose performance meets the requirements set for work provided for assessment.|
|Fail||0-49||Usually awarded to students whose performance is not considered to meet the minimum requirements set for particular tasks. The fail grade may be a result of insufficient preparation, of inattention to assignment guidelines or lack of academic ability. A frequent cause of failure is lack of attention to subject or assignment guidelines.|
For the purposes of quality assurance, Bond University conducts an evaluation process to measure and document student assessment as evidence of the extent to which program and subject learning outcomes are achieved. Some examples of student work will be retained for potential research and quality auditing purposes only. Any student work used will be treated confidentially and no student grades will be affected.
Students must check the iLearn@Bond subject site for detailed assessment information and submission procedures.
Policy on late submission and extensions
A late penalty will be applied to all overdue assessment tasks unless an extension is granted by the subject coordinator. The standard penalty will be 10% of marks awarded to that assessment per day late with no assessment to be accepted seven days after the due date. Where a student is granted an extension, the penalty of 10% per day late starts from the new due date.
Policy on plagiarism
The University’s Academic Integrity Policy defines plagiarism as the act of misrepresenting as one’s own original work: another’s ideas, interpretations, words, or creative works; and/or one’s own previous ideas, interpretations, words, or creative work without acknowledging that it was used previously (i.e., self-plagiarism). The University considers the act of plagiarising to be a breach of the Student Conduct Code and, therefore, subject to the Discipline Regulations, which provide for a range of penalties including the reduction of marks or grades, fines and suspension from the University.
Feedback on assessment
Feedback on assessment will be provided to students within two weeks of the assessment submission due date, as per the Assessment Policy.
If you have a disability, illness, injury or health condition that impacts your capacity to complete studies, exams or assessment tasks, it is important that you let us know your special requirements early in the semester. Students will need to make an application for support and submit it with recent, comprehensive documentation at an appointment with a Disability Officer. Students with a disability are encouraged to contact the Disability Office at the earliest possible time to meet staff and learn about the services available to meet their specific needs. Please note that late notification or failure to disclose your disability can be to your disadvantage, as the University cannot guarantee support under such circumstances.
Additional subject information
As part of the requirements for Business School quality accreditation, the Bond Business School employs an evaluation process to measure and document student assessment as evidence of the extent to which program and subject learning outcomes are achieved. Some examples of student work will be retained for potential research and quality auditing purposes only. Any student work used will be treated confidentially and no student grades will be affected.
An introduction to the field of Data Science, including the main skills and attributes of data scientists. An overview of data science terminology and the R programming language is also provided.
There is a growing trend in the quantitative fields towards producing reproducible research. The reasons for this trend and approaches to producing reproducible research using R are examined. Variable importance is also considered through the topic of Entropy.
Real world data is never conveniently packaged into bite-size files. In this topic, students learn to deal with the myriad of data formats available, as well as being introduced to the world of programmable APIs and databases.
Builds on previous topics with an in-depth exploration of databases and SQL. Students become proficient at reading relational database structures, and self-servicing their own data requirements using remote databases.
An introduction to decision trees, including the use of R and real-world data to construct and assess the quality of decision trees. The role of entropy in the underlying decision making framework is also examined.
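As a rough illustration of how entropy can guide a decision tree split, the sketch below computes Shannon entropy and the information gain of two candidate splits. The subject itself works in R; Python is used here purely for illustration, and the function names and toy churn labels are hypothetical, not subject material.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(parent, children):
    """Entropy reduction achieved by splitting `parent` into `children`."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical toy labels: did a customer churn?
parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]  # each branch is pure
useless_split = [["yes", "no"], ["yes", "no"]]  # each branch is still mixed

print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
```

A split that separates the classes completely recovers all of the parent's entropy, while a split that leaves each branch as mixed as the parent gains nothing.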
The mathematical structure of logistic regression is introduced using R. This technique is also compared to decision trees to evaluate which techniques work best with different types of data.
Additional considerations in modelling are introduced including assessing model quality, comparing models statistically, and the dangers of overfitting. Confusion matrices and ROC curves are also introduced.
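A confusion matrix for a binary classifier can be sketched as four counts, from which the usual summary rates follow. The example below is a minimal illustration in Python (the subject itself uses R); the function names and the toy labels are assumptions made for the example. An ROC curve plots the true positive rate against the false positive rate as the classification threshold varies, so each point on the curve corresponds to one such pair of rates.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def summary_rates(counts):
    """Accuracy, true positive rate and false positive rate from the counts."""
    tp, fp, tn, fn = (counts[k] for k in ("tp", "fp", "tn", "fn"))
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "true_positive_rate": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
    }


# Hypothetical toy labels (1 = positive class, 0 = negative class).
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(actual, predicted)
rates = summary_rates(counts)
print(counts)
print(rates)
```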
A common way to deploy data science work is through Shiny data visualisation. Students learn about visualisation, the Shiny framework, and a variety of packages available in R to produce high quality visualisations.
The measurement of similarity and distance in n-dimensional space is explored, along with the related data science technique, nearest neighbours. Applications using real world data in R to create and interpret nearest neighbour models are also considered.
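To make the link between distance and nearest neighbours concrete, the sketch below measures Euclidean distance in n-dimensional space and classifies a query point by majority vote among its k closest training points. It is written in Python for illustration only (the subject itself uses R); the function names, the choice of Euclidean distance, and the toy data are assumptions.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Majority label among the k training points closest to `query`.

    `train` is a list of (point, label) pairs.
    """
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups in two dimensions.
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 5.5), "B"), ((6.5, 7.0), "B"),
]

print(euclidean((0, 0), (3, 4)))
print(knn_predict(train, (1.2, 1.4)))
print(knn_predict(train, (6.2, 6.1)))
```

Because the vote depends entirely on the distance measure, variables on very different scales are usually rescaled before distances are computed.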
The final data science technique in this subject, clustering, is introduced. This unsupervised learning technique is compared with the nearest neighbours technique. Applications using real world data in R to create and interpret clustering models are also considered.
Provides a review and integration of the concepts and techniques introduced throughout the subject from a business perspective. Additionally, the specific kinds of business problems that can be solved using data science and a preview of a career as a data scientist are discussed to provide further context for the content of the subject.