![Page 1: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/1.jpg)
Data Science, Big Data, and Artificial Intelligence: Concept, Context, and
Applications
Prof. Zainal A. Hasibuan, PhD.
Chair, Asosiasi Pendidikan Tinggi Informatika dan Komputer (APTIKOM)
Webinar APTIKOM, 19 May 2020
![Page 2: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/2.jpg)
Yes, We Are Connected!
COVID-19 Proves the Concept of Connectivity
![Page 3: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/3.jpg)
Technologies That Make Things Connected
Artificial Intelligence
• Technology: software algorithms that automate complex decision-making tasks to mimic human thought processes and senses
• Benefit: can learn, understand, reason, plan, and act when fed with data
Internet of Things (IoT)
• Technology: an ecosystem of sensors, embedded computers, and "smart" devices
• Benefit: able to communicate among themselves and with private/public cloud services to collect, analyze, and present data about the physical world
3D Printing
• Technology: creates three-dimensional objects from a digital model by "printing" successive layers of material
• Benefit: a wide range of materials can be used, e.g. wood, glass, living cells for bio-printing; minimizes waste
Robotics
• Technology: machines with enhanced sensors, control, and intelligence used to automate, augment, or assist human activities
• Benefit: improves efficiency and productivity
Blockchain
• Technology: a digital ledger that uses software algorithms to record and confirm transactions with reliability and anonymity
• Benefit: improves traceability, transparency, efficiency, and security
Drone
• Technology: unmanned aircraft
• Benefit: highly versatile because of the wide variation in capacity, size, capability, and function
Virtual Reality (VR)
• Technology: implies a complete "immersion" experience that is 100% computer-generated
• Benefit: innovations can be presented without actually producing them
Augmented Reality (AR)
• Technology: offers a real-world experience with a computer-generated overlay
• Benefit: a blend of the real world and the computer
![Page 4: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/4.jpg)
Family
Music
Sport
Friends
Pets
Basically, We Are a Networked Society
![Page 5: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/5.jpg)
Potential for Implementing Data Science in Indonesia
250 million people
1,340 ethnic groups
17,508 islands
We are big
746 regional languages
We are adaptive
132.7 million internet users
106 million active social media users
371.4 million mobile phone subscribers
Demographic bonus of a productive-age population
Growing economy
We have opportunity
Stable politics and security
![Page 6: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/6.jpg)
Data Science Extracts Knowledge & Insights From Big Data
![Page 7: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/7.jpg)
Forming Society 5.0: A Human-Centered Society
![Page 8: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/8.jpg)
The Context of Data Science, Big Data, and Artificial Intelligence
Big Data (BD)
![Page 9: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/9.jpg)
Definitions, Techniques, and Examples of DS, BD, and AI

| Keyword | Definition | Techniques & Analysis | Example & Application |
| --- | --- | --- | --- |
| Data Science | Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data. | K-Means, Linear Regression, Naïve Bayes, etc. | Personalized healthcare recommendations |
| Big Data | Big Data is a massive volume of both structured and unstructured data that is so large that it is difficult to process using traditional database and software techniques. | Education Performance Analysis, Sentiment Analysis, Customer Behavior Analysis | Big Data of National Education System |
| Artificial Intelligence | Artificial intelligence (AI) is the ability of a computer program or a machine to think and learn. | Rule-based systems, Neural Networks, Fuzzy Models, etc. | Plagiarism Checkers |
![Page 10: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/10.jpg)
Why Are Data Science, Big Data, and Artificial Intelligence Important?
BIG DATA:
Value, Volume, Variety, Velocity, Veracity
![Page 11: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/11.jpg)
Big Data: More, Messy, Good Enough
• In this new world we can analyze far MORE data.
• Big data gives us an especially clear view of the granular: subcategories and submarkets that samples cannot assess.
• As scale increases, the number of inaccuracies increases as well (Messy).
• A move away from the search for causality toward discovering patterns and correlations.
• Big data is about WHAT, not WHY.
• Big data changes the nature of business, markets, and society.
• Value shifts from physical infrastructure to intangibles such as brands and intellectual property.
• Big data is the oil of the information economy.
• For individuals, the focus shifts from privacy to probability: the likelihood that one will have a heart attack, default on a mortgage, or commit a crime. The same probabilistic view applies to climate change, eradicating diseases, and fostering good governance and economic development.
![Page 12: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/12.jpg)
• deals with both structured and unstructured data
• a field that includes everything that is associated with the cleansing, preparation and final analysis of data
• combines programming, logical reasoning, mathematics, and statistics
• cleanses, prepares and aligns the data
• an umbrella of several techniques that are used for extracting the information and the insights of data
Source: Leonard Heiler, 2017. https://www.datasciencecentral.com/profiles/blogs/difference-of-data-science-machine-learning-and-data-mining
![Page 13: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/13.jpg)
Paradigm Shift of Big Data Computation in Data Science: From Factual to Potential
Foundational
• What happened?
• When and where?
• How much?
Advanced, Predictive
• What will happen?
• What will be the impact?
• Big Data Analysis
• Strategic Direction
• Interpretative
• Enterprise data
Data integration
• Descriptive
• Basic reporting
Data reporting
• Enterprise analytics
• Evidence-based medicine
• Outcomes analytics
Data analytics
• Population behavior
• Innovation
Data
Predictive
Prescriptive
• What are potential scenarios?
• What is the best course?
• How can we pre-empt and mitigate the crisis?
• Structured and unstructured data
• Future Direction
Source: (Hasibuan 2016)
Relational
• How one data item relates to another
• Rules and method
Role of Big Data
![Page 14: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/14.jpg)
Research Paradigm Shift: From Data to Big Data
Big Data
Sampled Data
Data
• Population
• Heterogeneous
• Pattern
• Representation
• Inference
• Hypothesis
• Limited
• Homogeneous
![Page 15: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/15.jpg)
How to Mechanize DS, BD, and AI?
• An organization that has large amounts of data gains a competitive advantage in its field.
• The more data an organization has, the more accurate its descriptions, predictions, and prescriptions can be.
• Data Science, Big Data, and Artificial Intelligence play significant roles in delivering solutions.
• This means using mathematical models to create algorithms that identify, classify, cluster, predict, learn from, and process data.
![Page 16: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/16.jpg)
DS, BD, and AI: Methodologies and Algorithms

| Keyword | Methodology | Algorithms |
| --- | --- | --- |
| Data Science | Classification (to classify), Regression (to predict), Similarity (to correlate) | Support vector machine (SVM), Linear Regression, Association Rule Mining, etc. |
| Big Data | Data Mining, Machine Learning, NLP | Support vector machine (SVM), K-Means, Naïve Bayes, etc. |
| Artificial Intelligence | Supervised Learning, Unsupervised Learning, and Reinforcement Learning | Support vector machine (SVM), K-Means, Naïve Bayes, Convolutional Neural Network (CNN), etc. |
![Page 17: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/17.jpg)
Example of Linear Regression
• One of the most widely-used methods of statistical analysis
• Applicable to many problems, particularly when the expected output is a score rather than a category
• Good for predicting trends and forecasting the effects of a new policy or other change.
https://www.kdnuggets.com/2016/08/10-algorithms-machine-learning-engineers.html
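The idea of predicting a score from a trend can be illustrated with a minimal sketch of ordinary least squares on one input variable. The data (programme year versus average score) and the function names `fit_line` and `predict` are hypothetical, chosen only for illustration; they do not come from the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Hypothetical data: programme year vs. average score.
years = [1, 2, 3, 4, 5]
scores = [52.0, 55.0, 61.0, 64.0, 68.0]

slope, intercept = fit_line(years, scores)
forecast = predict(slope, intercept, 6)  # extrapolate the trend one year ahead
print(f"slope={slope:.2f} intercept={intercept:.2f} forecast={forecast:.2f}")
```

The output is a number (a score), not a category, which is the situation the slide describes as a good fit for linear regression.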
![Page 18: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/18.jpg)
Example of Support vector machine (SVM)
• Learns to define a hyperplane that separates data into two classes
• Can help figure out an underlying separation mechanism between people
• Some of the biggest problems that have been solved using SVMs (with suitably modified implementations) are display advertising, human splice site recognition, image-based gender detection, and large-scale image classification
Source: James Le, 2016. https://www.kdnuggets.com/2016/08/10-algorithms-machine-learning-engineers.html
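As a rough sketch of "learning a hyperplane that separates two classes", the following trains a linear SVM by sub-gradient descent on the regularized hinge loss. This is one simple way to fit a linear SVM, not necessarily the method used in the applications listed above; the data points, hyperparameters, and function names are assumptions made for the example.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.01, epochs=200):
    """Fit w, b for the hyperplane w.x + b = 0; labels must be +1 or -1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The L2 regularization term shrinks w a little on every step.
            w = [wi - lr * lam * wi for wi in w]
            if margin < 1:
                # Point is misclassified or inside the margin: move the hyperplane.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def classify(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable 2-D data.
points = [(-2.0, -1.0), (-1.0, -2.0), (-2.0, -2.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
labels = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(points, labels)
print(classify(w, b, (3.0, 3.0)), classify(w, b, (-3.0, -3.0)))
```

Non-linear problems such as image classification usually need kernels or engineered features on top of this linear core.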
![Page 19: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/19.jpg)
Example of Naïve Bayes
• Not one algorithm, but a family of simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features.
• The algorithm learns to predict an attribute based on other, known features.
• Assumes all attributes of an item are independent of each other
http://uc-r.github.io/naive_bayes
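The independence assumption means the score of a class is just its prior multiplied by one per-feature probability for each attribute. A minimal sketch for categorical features, with add-one (Laplace) smoothing, is shown here; the weather-style data and the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def predict(model, row):
    """Return the label with the highest log-probability under the model."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            seen = value_counts[(i, label)][value]
            # Add-one smoothing keeps unseen values from zeroing the product.
            score += math.log((seen + 1) / (count + len(vocab[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: (outlook, temperature) -> play outside?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("sunny", "hot"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes", "no", "yes"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("sunny", "hot")), predict(model, ("rainy", "cool")))
```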
![Page 20: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/20.jpg)
10 Algorithms for Big Data Experts
Source: James Le, 2016. https://www.kdnuggets.com/2016/08/10-algorithms-machine-learning-engineers.html
Algorithm Explanation Image Source
K-Means Clustering
• A simple unsupervised learning algorithm that is often used on big data sets.
• Best suited for high-level, large-scale clustering
https://www.kdnuggets.com/2016/08/10-algorithms-machine-learning-engineers.html
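K-Means can be sketched as two alternating steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. The sketch below takes the starting centroids as an argument so the run is deterministic; the data and that design choice are assumptions for illustration only.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignment = [
            min(range(len(centroids)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids
    return centroids, assignment


# Hypothetical 2-D data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignment = kmeans(points, centroids=[points[0], points[3]])
print(assignment)
```

No labels are supplied at any point, which is what makes the method unsupervised.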
Association Rule Mining
• A learning algorithm that looks for associations that occur at high frequency
• Can identify associations that you might not expect from random sampling
https://gerardnico.com/data_mining/association
Linear Regression
• One of the most widely used methods of statistical analysis
• Applicable to many problems, particularly when the expected output is a score rather than a category
• Good for predicting trends and forecasting the effects of a new policy or other change.
https://www.kdnuggets.com/2016/08/10-algorithms-machine-learning-engineers.html
![Page 21: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/21.jpg)
Logistic Regression
• Used to determine the success or failure of a particular event
• A classification algorithm.
• A powerful statistical way to model a binomial outcome with one or more explanatory variables
• Measures the relationship between a categorical dependent variable and one or more independent variables by estimating probabilities using the logistic function
Source: James Le, 2016. https://www.kdnuggets.com/2016/08/10-algorithms-machine-learning-engineers.html
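To make "estimating probabilities using the logistic function" concrete, here is a minimal sketch that fits a one-variable logistic regression by batch gradient descent on the log loss. The pass/fail data, learning rate, and epoch count are assumptions picked for the example.

```python
import math


def sigmoid(z):
    """The logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(y=1 | x) = sigmoid(w * x + b) by gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data: hours studied vs. exam outcome (1 = pass, 0 = fail).
hours = [1, 2, 3, 4, 5, 6, 7, 8]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = train_logistic(hours, passed)
p_low = sigmoid(w * 1 + b)   # estimated probability of passing after 1 hour
p_high = sigmoid(w * 8 + b)  # estimated probability of passing after 8 hours
print(f"P(pass | 1h)={p_low:.2f}  P(pass | 8h)={p_high:.2f}")
```

Thresholding the probability at 0.5 turns the model into the binary success/failure classifier described above.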
![Page 22: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/22.jpg)
C4.5
• A supervised learning algorithm
• Developed by John Ross Quinlan; it generates a decision tree
• Builds the decision tree from inputs that have already been classified
• The decision tree can be used as a diagnostic tool
https://github.com/barisesmer/C4.5
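C4.5 chooses which attribute to split on using the gain ratio (information gain divided by split information). The sketch below covers only that selection step for categorical attributes; it is not a full C4.5 implementation (no recursion, pruning, or continuous attributes), and the symptom data is hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(rows, labels, attr):
    """Information gain of splitting on attr, normalized by split information."""
    total = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = -sum(len(g) / total * math.log2(len(g) / total) for g in groups.values())
    return gain / split_info if split_info > 0 else 0.0


def best_attribute(rows, labels, attrs):
    """Pick the attribute with the highest gain ratio for the next split."""
    return max(attrs, key=lambda a: gain_ratio(rows, labels, a))


# Hypothetical, already-classified diagnostic records.
rows = [
    {"fever": "yes", "cough": "yes"},
    {"fever": "yes", "cough": "no"},
    {"fever": "no", "cough": "yes"},
    {"fever": "no", "cough": "no"},
]
labels = ["sick", "sick", "healthy", "healthy"]

print(best_attribute(rows, labels, ["fever", "cough"]))
```

A full tree builder would apply `best_attribute` recursively to each resulting subset until the labels in a subset are pure.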
Support vector machine (SVM)
• Learns to define a hyperplane that separates data into two classes
• Can help figure out an underlying separation mechanism between people
• Some of the biggest problems that have been solved using SVMs (with suitably modified implementations) are display advertising, human splice site recognition, image-based gender detection, and large-scale image classification.
https://www.kdnuggets.com/2016/08/10-algorithms-machine-learning-engineers.html
![Page 23: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/23.jpg)
Apriori
• A similarity-matching algorithm
• Commonly used on transactional databases with a large number of transactions: a sparse matrix with items (attributes) along the horizontal axis and transactions along the vertical axis.
• Runs with a high level of computational overhead.
https://www.analyticsvidhya.com/blog/2014/08/effective-cross-selling-market-basket-analysis/
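The frequent-itemset part of Apriori can be sketched as a level-by-level search: count candidates of size k, keep those that meet the support threshold, then join the survivors into candidates of size k+1. The shopping baskets and the `min_support` value below are hypothetical, and rule generation (confidence, lift) is left out.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for itemsets in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        survivors = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(survivors)
        # Join step: combine surviving k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in survivors for b in survivors if len(a | b) == k}
        # Prune step: every (k-1)-item subset of a candidate must itself be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in survivors for s in combinations(c, k - 1))]
    return frequent


# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
frequent = apriori(baskets, min_support=3)
print(frequent[frozenset({"bread", "milk"})])
```

Every level rescans all transactions, which is where the computational overhead mentioned above comes from.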
EM (expectation-maximization)
• A clustering algorithm used for knowledge discovery
• Finds (local) maximum-likelihood parameters of a statistical model in cases where the equations cannot be solved directly.
• Predicts data that can be used in other statistical analysis methods.
https://medium.com/@thiagoricieri/understanding-expectation-maximization-and-soft-clustering-4645e997cdb6
EM (expectation-maximization)
• EM clustering of the Old Faithful eruption data.
• The random initial model (which, because of the different scales of the axes, appears very flat and wide) is fit to the observed data.
• In the first iteration, the model
https://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm
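A minimal sketch of EM for soft clustering fits a two-component, one-dimensional Gaussian mixture. The E-step computes how responsible each component is for each point, and the M-step re-estimates weights, means, and variances from those responsibilities. The durations below are made-up values loosely shaped like short and long eruptions, not the real Old Faithful dataset, and the starting means are an assumption.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(data, means, iterations=50):
    """Fit a two-component 1-D Gaussian mixture, starting from the given means."""
    means = list(means)
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate parameters from the responsibility-weighted points.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # floor so a component cannot collapse
    return weights, means, variances


# Hypothetical eruption durations in minutes (illustrative values only).
durations = [1.8, 2.0, 2.2, 1.9, 2.1, 4.3, 4.5, 4.7, 4.4, 4.6]
weights, means, variances = em_two_gaussians(durations, means=[1.0, 5.0])
print([round(m, 2) for m in means])
```

EM only finds a local optimum, so a different starting point can lead to a different fit.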
![Page 24: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/24.jpg)
Adaptive Boosting (AdaBoost)
• A general method that can be applied to a number of classifiers
• An algorithm that builds a classifier and then improves it
• Optimizes the learning ability of the participating learners.
Source: Brendan Marsh, 2016
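"Builds a classifier and then improves it" can be sketched with AdaBoost over one-dimensional threshold stumps: each round fits the stump with the lowest weighted error, then raises the weight of the points it got wrong so the next stump focuses on them. The data, the choice of stumps as the weak learner, and the number of rounds are assumptions for this example.

```python
import math


def train_adaboost(xs, ys, rounds=3):
    """AdaBoost with threshold stumps on 1-D inputs; labels are +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    thresholds = sorted(set(xs))
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        # Weak learner: the stump with the lowest weighted error.
        best = None
        for t in thresholds:
            for polarity in (1, -1):
                preds = [polarity if x >= t else -polarity for x in xs]
                err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
                if best is None or err < best[0]:
                    best = (err, t, polarity, preds)
        err, t, polarity, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, polarity))
        # Reweight: misclassified points gain weight, the rest lose weight.
        weights = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(alpha * (polarity if x >= t else -polarity)
                for alpha, t, polarity in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical labels that no single threshold can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

single = train_adaboost(xs, ys, rounds=1)
boosted = train_adaboost(xs, ys, rounds=3)
print(sum(predict(single, x) == y for x, y in zip(xs, ys)),
      sum(predict(boosted, x) == y for x, y in zip(xs, ys)))
```

The same loop works with any weak classifier that accepts sample weights, which is why the method is described as general.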
Naïve Bayes
• Not a single algorithm, but a family of simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features.
• The algorithm learns to predict an attribute based on other, known features.
• Assumes all attributes of an item are independent of each other
http://uc-r.github.io/naive_bayes
![Page 25: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/25.jpg)
![Page 26: Data Science, Big Data, and Artificial Intelligence ...aptikom.or.id/.../05/0_Data-Science-Big-Data-and-Artificial-Aptikom.pdf · Artificial Intelligence •Teknologi: algoritma perangkat](https://reader034.vdocuments.mx/reader034/viewer/2022042909/5f3a15c191870346e317d341/html5/thumbnails/26.jpg)
Conclusion
• These methodologies, techniques, and algorithms are the tools that Data Science, Big Data, and Artificial Intelligence use to classify data, identify similarities, and predict trends.
• Using Data Science to analyze Big Data is an effective way of turning the inherent value of large data into meaningful information and knowledge. Furthermore, Artificial Intelligence uses the results to train and retrain the system to gain business intelligence and insight.
• An organization's Big Data should be collected continuously so that it grows in volume and diversity, both spatially and temporally.