KISS Java EE, MicroProfile, AI, (Deep) Machine Learning
airhacks.fm podcast with Adam Bien - A podcast by Adam Bien
An airhacks.fm conversation with Pavel Pscheidl (@PavelPscheidl) about: a Pentium 1 at 75 MHz at the age of 12, first hello world at 17, a Quake 3 friend as programming coach, starting with Java 1.6 at the University of Hradec Kralove, second "hello world" with Operation Flashpoint, the third "hello world" was a Swing Java application as an introduction to object-oriented programming, introduction to enterprise Java in the 3rd year at the university, first commercial banking Java EE 6 / WebLogic project in Prague with mobile devices, working full time while studying, the first Java EE project was really successful, 2 months of development time, one DTO, no superfluous layers, using enunciate to generate the REST API, CDI and JAX-RS are a strong foundation, the first beep, fast JSF, CDI and JAX-RS deployments, the War of Frameworks, pragmatic Java EE, "no frameworks" project at a telco, reverse engineering Java EE, getting questions answered at airhacks.tv, working on a PhD and statistics, starting at h2o.ai, h2o is a Silicon Valley startup, h2o started as a distributed key-value store with involvement of Cliff Click, machine learning algorithms were introduced on top of the distributed cache: the advent of h2o, h2o is an open source company (see GitHub), Driverless AI is the commercial product, Driverless AI automates cumbersome tasks, all AI heavy lifting is written in Java, h2o provides a custom java.util.Map implementation as distributed cache, random forest is great for outlier detection, the computer vision library OpenCV, Gradient Boosting Machine (GBM), the open source airlines dataset, monitoring Java EE request processing queues with GBM, Generalized Linear Model (GLM), GBM vs. GLM, GBM is easier to explain, with the decision tree as output, XGBoost, at h2o XGBoost is written in C and comes with a JNI Java interface, XGBoost works well on GPUs, XGBoost is like GBM but optimized for GPUs, Word2vec, Deep Learning (Neural Networks), h2o generates an archive with the trained model that is directly usable in Java, K-Means, K-Means will try to find the answer without a teacher, AI is just predictive statistics on steroids, Isolation Random Forest (IRF), IRF was designed for outlier detection and K-Means was not, the Naïve Bayes Classifier is rarely used in practice because it assumes no relation between the features, Stacking is the combination of algorithms to improve the results, AutoML: Automatic Machine Learning, AutoML will try to find the right combination of algorithms to match the outcome, h2o provides a set of connectors: CSV, JDBC, Amazon S3, Google Cloud Storage, applying AI to Java EE logs, the amount of training data depends on the number of features, for each feature you will need approx. 30 observations, h2o world (the conference), cancer prediction with machine learning, preserving wildlife with AI, using AI for spider categorization
Pavel Pscheidl on Twitter: @PavelPscheidl, Pavel's blog: pavel.cool
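The rule of thumb from the notes (approx. 30 observations per feature) can be turned into a quick back-of-the-envelope check. The following is only a minimal sketch: the class name `TrainingDataEstimate`, the method `minObservations` and the example feature count are hypothetical and not part of h2o or the episode.

```java
public class TrainingDataEstimate {

    // Rule of thumb from the episode notes: roughly 30 observations per feature.
    public static final int OBSERVATIONS_PER_FEATURE = 30;

    // Hypothetical helper: rough lower bound on training rows for a given number of features.
    public static long minObservations(int featureCount) {
        if (featureCount < 0) {
            throw new IllegalArgumentException("featureCount must not be negative");
        }
        return (long) featureCount * OBSERVATIONS_PER_FEATURE;
    }

    public static void main(String[] args) {
        // Assumed example: 12 features extracted from Java EE log entries.
        int features = 12;
        System.out.println(features + " features -> roughly " + minObservations(features) + " observations");
    }
}
```

The result is a starting estimate, not a guarantee; how much data a model really needs also depends on the algorithm and the data quality.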