Building on students’ existing knowledge of data science techniques, this subject investigates the range of deployment options for automatically extracting insights from the vast amounts of data available. These include traditional server and database deployment, as well as a range of popular cloud solutions, including open-source alternatives. The advantages and disadvantages of different approaches will be discussed. In addition to popular big data analytics deployment options such as Amazon Web Services (AWS), Microsoft Azure, Google BigQuery, Apache Spark, H2O.ai and NoSQL, students will also learn about the MapReduce and Hadoop frameworks. Importantly, the security implications of big data analytics deployments will be discussed, including the principles of cybersecurity and how to implement basic best practices.
|Academic unit:||Bond Business School|
|Subject title:||Infrastructure for Data Analytics|
Delivery & attendance
|Attendance and learning activities:||Attendance at all class sessions is expected. Students are expected to notify the instructor of any absences with as much advance notice as possible.|
|Prescribed resources:||No prescribed resources. After enrolment, students can check the Books and Tools area in iLearn for the full Resource List.|
|iLearn@Bond & Email:||iLearn@Bond is the online learning environment at Bond University and is used to provide access to subject materials, lecture recordings and detailed subject information regarding the subject curriculum, assessment and timing. Both iLearn and the Student Email facility are used to provide important subject notifications. Additionally, official correspondence from the University will be forwarded to students’ Bond email account and must be monitored by the student.|
To access these services, log on to the Student Portal from the Bond University website at www.bond.edu.au
Assumed knowledge is the minimum level of knowledge of a subject area that students are assumed to have acquired through previous study. It is the responsibility of students to ensure they meet the assumed knowledge expectations of the subject. Students who do not possess this prior knowledge are strongly recommended against enrolling and do so at their own risk. No concessions will be made for students’ lack of prior knowledge.
Assumed Prior Learning (or equivalent):
Assurance of learning
Assurance of Learning means that universities take responsibility for creating, monitoring and updating curriculum, teaching and assessment so that students graduate with the knowledge, skills and attributes they need for employability and/or further study.
At Bond University, we carefully develop subject and program outcomes to ensure that student learning in each subject contributes to the whole student experience. Students are encouraged to carefully read and consider subject and program outcomes as combined elements.
Program Learning Outcomes (PLOs)
Program Learning Outcomes provide a broad and measurable set of standards that incorporate a range of knowledge and skills that will be achieved on completion of the program. If you are undertaking this subject as part of a degree program, you should refer to the relevant degree program outcomes and graduate attributes as they relate to this subject.
Subject Learning Outcomes (SLOs)
On successful completion of this subject the learner will be able to:
- Identify and apply frameworks for distributed storage and parallel processing using multiple (virtual) computers
- Describe a variety of cloud-based deployment options for big data analytics and implement simple, prototype deployments
- Identify a variety of traditional database and server deployment options for big data analytics and implement simple, prototype deployments
- Articulate the security risks, particularly cybercrime, associated with a variety of deployment options for big data analytics
- Identify the principles of cyber-safe deployment and implement basic safeguards to prototype deployments
- Critically compare the advantages and disadvantages of different deployment options for big data analytics
|Student Engagement||Participation||20%||Ongoing||2, 4.|
|Written Report||Report||40%||Week 10||3, 4, 5, 6.|
|Skills Test||Skills test||40%||Week 12||1, 2, 3, 4, 5, 6.|
- * Assessment timing is indicative of the week that the assessment is due or begins (where conducted over multiple weeks), and is based on the standard University academic calendar
- C = Students must reach a level of competency to successfully complete this assessment.
|High Distinction||85-100||Outstanding or exemplary performance in the following areas: interpretative ability; intellectual initiative in response to questions; mastery of the skills required by the subject, general levels of knowledge and analytic ability or clear thinking.|
|Distinction||75-84||Usually awarded to students whose performance goes well beyond the minimum requirements set for tasks required in assessment, and who perform well in most of the above areas.|
|Credit||65-74||Usually awarded to students whose performance is considered to go beyond the minimum requirements for work set for assessment. Assessable work is typically characterised by a strong performance in some of the capacities listed above.|
|Pass||50-64||Usually awarded to students whose performance meets the requirements set for work provided for assessment.|
|Fail||0-49||Usually awarded to students whose performance is not considered to meet the minimum requirements set for particular tasks. The fail grade may be a result of insufficient preparation, of inattention to assignment guidelines or lack of academic ability. A frequent cause of failure is lack of attention to subject or assignment guidelines.|
For the purposes of quality assurance, Bond University conducts an evaluation process to measure and document student assessment as evidence of the extent to which program and subject learning outcomes are achieved. Some examples of student work will be retained for potential research and quality auditing purposes only. Any student work used will be treated confidentially and no student grades will be affected.
Students must check the iLearn@Bond subject site for detailed assessment information and submission procedures.
Policy on late submission and extensions
A late penalty will be applied to all overdue assessment tasks unless an extension is granted by the subject coordinator. The standard penalty will be 10% of marks awarded to that assessment per day late with no assessment to be accepted seven days after the due date. Where a student is granted an extension, the penalty of 10% per day late starts from the new due date.
Policy on plagiarism
The University’s Academic Integrity Policy defines plagiarism as the act of misrepresenting as one’s own original work: another’s ideas, interpretations, words, or creative works; and/or one’s own previous ideas, interpretations, words, or creative work without acknowledging that it was used previously (i.e., self-plagiarism). The University considers the act of plagiarising to be a breach of the Student Conduct Code and, therefore, subject to the Discipline Regulations which provide for a range of penalties including the reduction of marks or grades, fines and suspension from the University.
Feedback on assessment
Feedback on assessment will be provided to students within two weeks of the assessment submission due date, as per the Assessment Policy.
If you have a disability, illness, injury or health condition that impacts your capacity to complete studies, exams or assessment tasks, it is important that you let us know your special requirements early in the semester. You will need to make an application for support and submit it with recent, comprehensive documentation at an appointment with a Disability Officer. Students with a disability are encouraged to contact the Disability Office at the earliest possible time to meet staff and learn about the services available to meet their specific needs. Please note that late notification or failure to disclose your disability can be to your disadvantage, as the University cannot guarantee support under such circumstances.
Additional subject information
As part of the requirements for Business School quality accreditation, the Bond Business School employs an evaluation process to measure and document student assessment as evidence of the extent to which program and subject learning outcomes are achieved. Some examples of student work will be retained for potential research and quality auditing purposes only. Any student work used will be treated confidentially and no student grades will be affected.
Data analytic systems are now used as a standard part of business operations. These systems require appropriate infrastructure to operate correctly and efficiently. We define infrastructure categories using a classical information systems approach.
Paper and electronic databases have been an important part of businesses for many decades. The rise of data analytics has meant that traditional relational approaches need to be supplemented by emerging NoSQL-style databases, which are better suited to capturing increasingly important semi-structured and unstructured data. (SLO 3)
The cloud gives the public and companies access to a large amount of hardware and software infrastructure. Key issues, including the types of resources available and security, are discussed. (SLO 2)
The base of any digital information system is hardware. The five different categories of hardware are discussed in relation to the needs of data analytic systems. (SLO 6)
Given the growth of data that requires analysis, parallel resources are needed to process it, particularly for real-time applications and continuous data. Paradigms such as MapReduce and tools like Hadoop and Spark are explored. Both coarse-grain and fine-grain parallelism techniques are covered, along with metrics for each. (SLO 1)
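As an informal illustration of the MapReduce paradigm mentioned above, the following minimal sketch counts words in a small set of documents using only the Python standard library. It runs on a single machine and is not Hadoop or Spark code; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample documents are hypothetical and chosen purely for illustration. In a real framework, the map and reduce steps would be distributed across multiple (virtual) computers.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run the three phases in sequence over a list of documents."""
    # Each document is mapped independently, which is the step a
    # framework would run in parallel on separate workers.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    docs = ["big data needs big infrastructure", "data drives decisions"]
    print(word_count(docs))
```

Because each call to `map_phase` depends only on its own document, the map step can be split across workers without coordination; only the shuffle step needs to bring values for the same key together.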
People are the most important part of any system. The organisation of people, in both traditional company structures and emerging flat and matrix structures, is discussed along with the advantages and disadvantages of each. The different job roles in data analytic system ecosystems are also explored. (SLO 6)
The documentation of data analytic systems is essential to ensure that users and developers have an accurate understanding of them. Three different aspects of this topic are explored: algorithms, pseudocode and business processes. The tools of efficiency analysis and business process modelling notation are introduced. (SLO 6)
In data-driven systems, breaches of privacy and unethical use of data are important considerations. Both potential threats and the design of mitigation measures are discussed. Different ethical frameworks are also considered. (SLO 6)