However, designing games that provide useful behavioural data is a difficult task that typically requires significant trial and error.

To probe the CNN, we applied Gradient-weighted Class Activation Mapping, which revealed that the decision logic closely mimicked rules used by experts (C-statistic 0.96).

Prior to joining Stanford, he was an Assistant Professor of Computer Science at MIT.

What does the ubiquity of machine learning mean for how people build and deploy systems and applications?

News: Join our email list to get notified of the speaker and livestream link every week!

Using ML Prediction APIs more Accurately and Economically
Machine Learning to Classify Intracardiac Electrical Patterns During Atrial Fibrillation
Developments in MLflow: A System to Accelerate the Machine Learning Lifecycle
ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
Offload Annotations: Bringing Heterogeneous Computing to Existing Libraries and Workloads
Spectral Lower Bounds on the I/O Complexity of Computation Graphs
Selection via Proxy: Efficient Data Selection for Deep Learning
Fleet: A Framework for Massively Parallel Streaming on FPGAs
Willump: A Statistically-Aware End-to-end Optimizer for Machine Learning Inference
Model Assertions for Monitoring and Improving ML Models
Improving the Accuracy, Scalability, and Performance of Graph Neural Networks with Roc
Optimizing Data-Intensive Computations in Existing Libraries with Split Annotations
TASO: Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions
PipeDream: Generalized Pipeline Parallelism for DNN Training
Outsourcing Everyday Jobs to Thousands of Cloud Functions with gg
Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark
From Laptop to Lambda: Outsourcing Everyday Jobs to Thousands of Transient Functional Containers
LIT: Learned Intermediate Representation Training for Model Compression
Debugging Machine Learning via Model Assertions
To Index or Not to Index: Optimizing Exact Maximum Inner Product Search
Beyond Data and Model Parallelism for Deep Neural Networks
Optimizing DNN Computation with Relaxed Graph Substitutions
Challenges and Opportunities in DNN-Based Video Analytics: A Demonstration of the BlazeIt Video Query Engine
Accelerating the Machine Learning Lifecycle with MLflow
Model Assertions for Debugging Machine Learning
Analysis of the Time-To-Accuracy Metric and Entries in the DAWNBench Deep Learning Benchmark
Accelerating Deep Learning Workloads through Efficient Multi-Model Execution
Exploring the Use of Learning Algorithms for Efficient Performance Profiling
Block-wise Intermediate Representation Training for Model Compression
Filter Before You Parse: Faster Analytics on Raw Data with Sparser
Evaluating End-to-End Optimization for Data Analytics Applications in Weld
MISTIQUE: A System to Store and Query Model Intermediates for Model Diagnosis
Structured Streaming: A Declarative API for Real-Time Applications in Apache Spark
Accelerating Model Search with Model Batching
BlazeIt: An Optimizing Query Engine for Video at Scale
DAWNBench: An End-to-End Deep Learning Benchmark and Competition
Stadium: A Distributed Metadata-Private Messaging System
NoScope: Optimizing Neural Network Queries over Video at Scale
Splinter: Practical Private Queries on Public Data
Weld: A Common Runtime for High Performance Data Analytics
Yggdrasil: An Optimized System for Training Deep Decision Trees at Scale
Apache Spark: A Unified Engine for Big Data Processing
Voodoo – A Vector Algebra for Portable Database Performance on Modern Hardware
Matrix Computations and Optimizations in Apache Spark
GraphFrames: An Integrated API for Mixing Graph and Relational Queries
ModelDB: A System for Machine Learning Model Management
FairRide: Near-Optimal, Fair Cache Sharing
Vuvuzela: Scalable Private Messaging Resistant to Traffic Analysis
Scaling Spark in the Real World: Performance and Usability
Spark SQL: Relational Data Processing in Spark
Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks
A Cloud-Compatible Bioinformatics Pipeline for Ultrarapid Pathogen Identification from Next-Generation Sequencing of Clinical Samples
An Architecture for Fast and General Data Processing on Large Clusters
Discretized Streams: Fault-Tolerant Streaming Computation at Scale
Sparrow: Distributed, Low-Latency Scheduling
Choosy: Max-Min Fair Sharing for Datacenter Jobs with Constraints
Multi-Resource Fair Queueing for Packet Processing
Fast and Interactive Analytics over Hadoop Data with Spark
Discretized Streams: An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters
Cloud Terminal: Secure Access to Sensitive Applications from Untrusted Systems
Shark: Fast Data Analysis Using Coarse-grained Distributed Memory
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing

Presidential Early Career Award for Scientists and Engineers (PECASE), 2019
U. Waterloo Faculty of Mathematics Young Alumni Achievement Medal, 2014
David J. Sakrison Prize for Research, UC Berkeley, 2013
Best Paper Awards at SIGCOMM 2012 and NSDI 2012

Abstract: We present POSH, a framework that accelerates shell applications with I/O-heavy components, such as data analytics with command-line utilities.

Any video, audio, and/or slides that are posted after the event are also free and open to everyone.

Distributed Systems, Machine Learning, Databases, Security.

The form will be emailed to students each week. During class, one or two students will spend 10-15 minutes presenting the day's paper, and will then lead the subsequent discussion.

The Economist, and MacroBase DIFF.
Here we describe SURPI ("sequence-based ultrarapid pathogen identification"), a computational pipeline for pathogen identification from complex metagenomic NGS data generated from clinical samples, and demonstrate use of the pipeline in the analysis of 237 clinical samples comprising more than 1.1 billion sequences.

A., DeRisi, J. L., Sittler, T., Hackett, J., Miller, S., Chiu, C. Y.

Multi-Resource Fair Queueing for Packet Processing.

Stanford DAWN Lab and Databricks.

Instructors: Christos Kozyrakis and Matei Zaharia
TA: Qian Li
Autumn 2018, Mon/Wed 10:30 AM - 12:20 PM, room 200-030
3 units
Piazza: Class Homepage, Signup Link
The largest change in the computer …

Matei Zaharia (Assistant Professor)

Results - In the separate test cohort (50,000 grids), the CNN reproducibly classified AF image grids into those with/without rotational sites with 95.0% accuracy (CI 94.8-95.2%).

In granular computing, Matei's group is collaborating with other Platform Lab PIs on the gg project, a distributed, massively scalable build system using serverless functions.

widely used datacenter software such as Apache Mesos,

Ghodsi, A., Sekar, V., Zaharia, M., Stoica, I.

Stanford DAWN Project

Conclusions - Convolutional neural networks improved the classification of intracardiac AF maps compared to other analyses, and agreed with expert evaluation.

ZDNet, Stanford DAWN Project, Daniel Kang. Stanford Daily.

Matei Zaharia's 87 research works with 26,621 citations and 21,968 reads, including: DIFF: a relational interface for large-scale data explanation

The best games require only half as many players to attain the same level of precision.
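The interval reported with the 95.0% test accuracy can be sanity-checked with a short calculation. The sketch below assumes a normal-approximation (Wald) interval for a binomial proportion and treats each of the 50,000 test grids as an independent trial. The source does not say how its interval was computed, so both the method and the function name `wald_interval` are illustrative assumptions rather than the authors' procedure.

```python
import math


def wald_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation interval for a proportion (assumed method).

    p_hat: observed proportion, n: number of trials,
    z: standard normal quantile (1.96 for a 95% interval).
    """
    standard_error = math.sqrt(p_hat * (1.0 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


# Figures quoted in the text: 95.0% accuracy on 50,000 test grids.
low, high = wald_interval(0.95, 50_000)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")
```

Under these assumptions the result rounds to the same 94.8-95.2% range quoted above. If the grids are correlated (for example, several grids drawn from the same patient), the true interval would be wider than this estimate.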
Matei Zaharia is an assistant professor at Stanford CS, where he works on computer systems and machine learning as part of Stanford DAWN. He started the Spark project during his PhD at UC Berkeley in 2009. Matei has also worked broadly in datacenter systems, co-starting the Apache Mesos project and contributing as a committer on Apache Hadoop.

matei@cs.stanford.edu | Google Scholar | Twitter

CS 245: Replication strategies, Partitioning strategies, Atomic commitment & 2PC, CAP, Avoiding coordination, Parallel query execution

Games can be motivating and engaging experiences that facilitate learning, leading to their increasing use in education.

Methods - We performed panoramic recording of bi-atrial electrical signals in AF.