A few things I learnt at:


Open Data Science Conference

ODSC - WEST - 2015

why it's called "The Bay" area

what I remembered

but this time:

 

Brian Granger, Jupyter

version 4.1 (Q4 2015)
  • Multi-cell selections and actions
  • notebook-wide find and replace
  • atom-style palette
Jupyter Workbench (Q2 2016)
  • gui rewrite: loosely coupled npm modules
  • flexible layout mixing components
  • notebooks, text editors, outputs, widgets and visu
  • 3rd party plug-ins
  • collaborative editing

Juliet Hougland, Cloudera

Top Tips for running PySpark
  • Apache Spark: fast and general engine for large-scale data processing
  • PySpark: Python bindings for Apache Spark
  • Tips: use DataFrames, limit data transfers between JVM and Python processes
  • CloudPickle: serialize Python constructs not supported by the pickle module
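A minimal sketch of why CloudPickle matters here, using only the standard library to show the gap it fills. The helper `try_pickle` and the `square` lambda are illustrative names, not anything shown in the talk; the last step only runs if the `cloudpickle` package happens to be installed.

```python
import pickle


def try_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# Hypothetical example: the kind of small function one passes to a Spark map.
square = lambda x: x * x

print(try_pickle([1, 2, 3]))  # plain data pickles fine
print(try_pickle(square))     # a lambda cannot be pickled by reference

# Optional: cloudpickle serializes the function by value instead.
try:
    import cloudpickle
except ImportError:
    cloudpickle = None

if cloudpickle is not None:
    restored = pickle.loads(cloudpickle.dumps(square))
    print(restored(4))
```

Standard pickle stores functions by name, so anything without an importable name (lambdas, closures, functions defined interactively) fails; serializing by value is what lets such functions be shipped to worker processes.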
Open Diversity Data
  • http://opendiversitydata.org/
  • was not mentioned at the conference, but...
  • "Companies already collect data about their employee demographics. Let's ask them to publish it."

Wes McKinney, Cloudera

After Pandas: Project Ibis
  • uniform Python based interface similar to Pandas, for big data queries
  • a new project is easier than working with Pandas technical debt
  • current Python solutions for Big Data suffer from moving data between the db and Python
  • Ibis will have a shared memory architecture
  • first SQLite and Impala drivers, next other dbs...
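The data-movement point above can be illustrated with a small sketch. This is not the Ibis API: it uses the standard `sqlite3` module and a made-up `trips` table purely to contrast pulling every row into Python with letting the database do the aggregation. The function names are hypothetical.

```python
import sqlite3

# Hypothetical in-memory table, just for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (city TEXT, distance REAL)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [("SF", 2.0), ("SF", 4.0), ("SD", 3.0)],
)


def totals_in_python(conn):
    """Fetch every row, then aggregate in Python (all data crosses the boundary)."""
    totals = {}
    for city, distance in conn.execute("SELECT city, distance FROM trips"):
        totals[city] = totals.get(city, 0.0) + distance
    return totals


def totals_in_database(conn):
    """Aggregate inside the database; only one row per city comes back."""
    query = "SELECT city, SUM(distance) FROM trips GROUP BY city"
    return dict(conn.execute(query).fetchall())


print(totals_in_python(conn))
print(totals_in_database(conn))
```

Both functions return the same totals, but the second transfers one row per group instead of one row per record, which is the cost a Pandas-like interface that compiles to database queries aims to avoid.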

Anthony Goldbloom, Kaggle

Observing 400,000 data scientists working on ML competitions
  • the structured datasets group focuses on hand-crafting better features
  • the unstructured datasets group instead builds neural networks
  • Python is now the leading language, especially for neural networks
  • R ggplot library still has considerable popularity
new Kaggle Scripts feature:
  • upload your code (must have an open source Apache 2 license, and be shared) and run it in a container
  • open for anyone to use, fork, or run in a Kaggle container with different inputs

... next time SD

San Francisco 2015

  • 60 organisers/volunteers

  • 1000 participants

  • 200 from San Diego !

San Diego 2016




    ODSC SD 2016 ?

    another fun conference...

In [1]:
from IPython.display import HTML

def css_styling():
    # Load the custom stylesheet used to style the slides
    with open("./styles/custom.css", "r") as f:
        styles = f.read()
    return HTML(styles)

css_styling()
Out[1]:
In [2]:
# !jupyter nbconvert I_went_to_ODSC_WEST_2015.ipynb --to slides --post serve