spark2014
SPARK 2014 is the new version of SPARK, a software development technology specif
Ada · 217 · gpl-3.0
6 months ago
spark
Apache Spark - A unified analytics engine for large-scale data processing
Scala · 36822 · apache-2.0
8 months ago
big-data, java, jdbc
awesome-spark
A curated list of awesome Apache Spark packages and resources.
Shell · 1603 · cc0-1.0
last year
apache-spark, awesome, pyspark
spark-gotchas
Spark Gotchas. A subjective compilation of the Apache Spark tips and tricks
355 · other
7 years ago
apache-spark, book, guide
spark-sklearn
(Deprecated) Scikit-learn integration package for Apache Spark
Python · 1077 · apache-2.0
4 years ago
apache-spark, grid-search, machine-learning
spark-cassandra-connector
DataStax Connector for Apache Spark to Apache Cassandra
Scala · 1930 · apache-2.0
3 months ago
cassandra, scala, spark
spark-cassandra-stress
A tool for testing the DataStax Spark Connector against Apache Cassandra or DSE
Scala · 25 · apache-2.0
last year
spark
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers
C# · 1999 · mit
26 days ago
analytics, apache-spark, azure
shc
The Apache Spark - Apache HBase Connector is a library to support Spark accessin
Scala · 551 · apache-2.0
3 years ago
dbscan-on-spark
An implementation of DBSCAN running on top of Apache Spark
Scala · 181 · apache-2.0
6 years ago
spark-nlp
State of the Art Natural Language Processing
Scala · 3495 · apache-2.0
6 months ago
albert, bert, entity-extraction
jpmml-evaluator-spark
PMML evaluator library for the Apache Spark cluster computing system (http://spa
Java · 94 · agpl-3.0
2 years ago
EMR_Spark_Automation
A repository for deploying an AWS EMR cluster and submitting spark jobs on it. Bo
Python · 8 · apache-2.0
7 years ago
kotlin-spark-api
This project gives Kotlin bindings and several extensions for Apache Spark. We
Kotlin · 441 · apache-2.0
last month
bigdata, kotlin, nullability
spark-fast-tests
Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and
Scala · 409 · mit
4 months ago
spark, testing-framework
neo4j-spark-connector
Neo4j Connector for Apache Spark, which provides bi-directional read/write acces
Scala · 301 · apache-2.0
3 months ago
bolt, cypher, hacktoberfest
spark-notebook
Interactive and Reactive Data Science using Scala and Spark.
JavaScript · 3148 · apache-2.0
12 months ago
apache-spark, data-science, notebook
deep-spark
Connecting Apache Spark with different data stores [DEPRECATED]
Java · 197 · apache-2.0
8 years ago
spark-by-example
SPARK by Example is an adaptation of ACSL by Example for SPARK 2014, a programmi
Ada · 145
2 years ago
ada, formal-methods, formal-specification
spark-riak-connector
The official Riak Spark Connector for Apache Spark with Riak TS and Riak KV
Scala · 60 · apache-2.0
7 years ago
gatling-sql
Gatling Extension for JDBC or Spark Thrift Server stress tests
Scala · 6 · apache-2.0
3 years ago
gatling, jdbc, stress-testing
sample-SparkJobserverCassandra
Simple sample job illustrating the use of Spark Jobserver to execute Apache Spar
Scala · 2 · apache-2.0
8 years ago
netapp-public
delight
A Spark UI and Spark History Server alternative with CPU and Memory metrics! Del
Scala · 332 · other
last year
apache-spark, cpu, dashboard
twut
An open-source toolkit for analyzing line-oriented JSON Twitter archives with Ap
Scala · 9 · apache-2.0
last year
apache-spark, spark, spark-packages
gneiss
Framework for platform-independent SPARK components
Ada · 22 · agpl-3.0
4 years ago
ada, component-based, embedded
libsparkcrypto
A cryptographic library in SPARK 2014
Ada · 26
3 years ago
crypto-library, formal-verification
libkeccak
SHA-3 and other Keccak related algorithms in SPARK/Ada.
Ada · 31 · bsd-3-clause
7 months ago
ada, ascon, cshake
tensorframes
[DEPRECATED] Tensorflow wrapper for DataFrames on Apache Spark
Scala · 753 · apache-2.0
last year
CuBit
General-purpose, formally-verified, 64-bit operating system in SPARK/Ada for x86
Ada · 75 · gpl-3.0
3 years ago
ada, os, spark
magellan
Geo Spatial Data Analytics on Spark
Scala · 531 · apache-2.0
3 years ago
big-data, geojson, geometric-algorithms
ArchiveSpark
An Apache Spark framework for easy data processing, extraction as well as deriva
Scala · 140 · mit
3 months ago
archivespark, internet-archive, spark
sparkplug
Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌
Scala · 28 · apache-2.0
4 years ago
datapipeline, spark, spark-sql
SPARK_NORX
An Ada 2012 / SPARK 2014 project that implements the NORX authenticated encrypti
Ada · 8 · isc
6 years ago
SPARK_SipHash
An Ada 2012 / SPARK 2014 project that implements the SipHash keyed hash function
Ada · 5 · other
6 years ago
sparkmagic
Jupyter magics and kernels for working with remote Spark clusters
Python · 1287 · other
6 days ago
cluster, jupyter, jupyter-notebook
Mobius
C# and F# language binding and extensions to Apache Spark
C# · 938 · mit
3 months ago
apache-spark, bigdata, csharp
flintrock
A command-line tool for launching Apache Spark clusters.
Python · 631 · apache-2.0
3 months ago
apache-spark, apache-spark-cluster, ec2
sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Scala · 525 · apache-2.0
5 years ago
analytics, hdfs, kafka
benchm-ml
A minimal benchmark for scalability, speed and accuracy of commonly used open so
R · 1862 · mit
2 years ago
data-science, deep-learning, gradient-boosting-machine
pyspark-stubs
Apache (Py)Spark type annotations (stub files).
Python · 114 · apache-2.0
2 years ago
apache-spark, mypy, pep484
geni
A Clojure dataframe library that runs on Spark
Clojure · 272 · apache-2.0
5 months ago
big-data, clojure, clojure-library
Clustering4Ever
C4E, a JVM friendly library written in Scala for both local and distributed (Sp
Scala · 128 · apache-2.0
3 years ago
ai, artificial-intelligence, big-data
spindle
Next-generation web analytics processing with Scala, Spark, and Parquet.
JavaScript · 333 · apache-2.0
9 years ago
wildfire
🔥From a little spark may burst a flame.
CSS · 178 · gpl-3.0
5 years ago
comment-plugin, comments, firebase
xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library
C++ · 25350 · apache-2.0
2 months ago
distributed-systems, gbdt, gbm
sparkling
A Clojure library for Apache Spark: fast, fully-featured, and developer friendly
Clojure · 443 · epl-1.0
2 years ago
elephas
Distributed Deep learning with Keras & Spark
Python · 1569 · mit
last year
deep-learning, distributed-computing, keras
ngx-trend
📈 Simple, elegant spark lines for Angular
TypeScript · 116 · mit
2 years ago
angular, angular2, ngmodule
osm4scala
Scala and Spark library focused on reading OpenStreetMap Pbf files.
Scala · 74 · mit
9 months ago
gis, openstreetmap, openstreetmap-pbf-files
crossdata
DISCONTINUED - Easy access to big things. Library for Apache Spark extending and
Scala · 169 · apache-2.0
4 years ago
pysparkling
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
Python · 261 · other
last year
apache-spark, data-processing, data-science
strong-together
A starter project to build single page Vue.js apps as stand-alone or for Laravel
CSS · 89 · mit
7 years ago
itachi
A library that brings useful functions from various modern database management s
Scala · 52 · apache-2.0
8 months ago
hive, postgres, presto
deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for dat
Scala · 3068 · apache-2.0
3 months ago
dataquality, scala, spark
glow
Glow is an easy-to-use distributed computation system written in Go, similar to
Go · 3179
6 years ago
DW1000
A SPARK/Ada driver for the DecaWave DW1000 Ultra-Wideband transceiver.
Ada · 11 · mit
5 years ago
decawave, dw1000, ranging-sensor
koalas
Koalas: pandas API on Apache Spark
Python · 3310 · apache-2.0
8 months ago
big-data, data-science, dataframe
delta
An open-source storage framework that enables building a Lakehouse architecture
HTML · 6708 · apache-2.0
3 months ago
acid, analytics, big-data
sample-KafkaSparkCassandra
Introductory sample scala app using Apache Spark Streaming to accept data from K
Scala · 23
5 years ago
netapp-public
LiFT
The LinkedIn Fairness Toolkit (LiFT) is a Scala/Spark library that enables the m
Scala · 167 · bsd-2-clause
last year
fairness, fairness-ai, fairness-ml
awesome-ada
A curated list of awesome resources related to the Ada and SPARK programming lan
563 · cc0-1.0
last month
ada, ada-binding, ada-framework
RoaringBitmap
A better compressed bitset in Java: used by Apache Spark, Netflix Atlas, Tablesa
Java · 3325 · apache-2.0
3 months ago
bitset, druid, java
scylla-migrator
Migrate data extract using Spark to Scylla, normally from Cassandra
Scala · 51 · apache-2.0
6 days ago
streaming-benchmarks
Benchmarks for Low Latency (Streaming) solutions including Apache Storm, Apache
Jupyter Notebook · 623 · apache-2.0
5 months ago
benchmarks, low-latency, streaming
TensorFlowOnSpark
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Python · 3861 · apache-2.0
10 months ago
cluster, featured, machine-learning
kafka-sparkstreaming-cassandra
Docker container for Kafka - Spark Streaming - Cassandra
Jupyter Notebook · 97
5 years ago
neo4j-mazerunner
Mazerunner extends a Neo4j graph database to run scheduled big data graph comput
Java · 378 · apache-2.0
last year
incubator-livy
Apache Livy is an open source REST interface for interacting with Apache Spark f
Scala · 856 · apache-2.0
21 days ago
apachelivy, bigdata, livy
dist-keras
Distributed Deep Learning, with a focus on distributed training, using Keras and
Python · 624 · gpl-3.0
6 years ago
apache-spark, data-parallelism, data-science
livy
Livy is an open source REST interface for interacting with Apache Spark from any
Scala · 1004
2 years ago
hamilton
Your single tool to express data, ML, and LLM pipelines with simple python funct
Jupyter Notebook · 1063 · bsd-3-clause-clear
5 months ago
dag, data-analysis, data-engineering
sparkling-water
Sparkling Water provides H2O functionality inside Spark cluster
Scala · 957 · apache-2.0
5 months ago
big-data, h2o, integration
vue-info-card
Simple and beautiful card component with an elegant spark line, for VueJS.
JavaScript · 189 · mit
last year
card, card-component, component
oryx
Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large sc
Java · 1790 · apache-2.0
3 years ago
apache-kafka, apache-spark, cloudera
ydata-profiling
1 Line of code data quality profiling & exploratory data analysis for Pandas and
Python · 11847 · mit
2 months ago
big-data-analytics, data-analysis, data-exploration
adam
ADAM is a genomics analysis platform with specialized file formats built using A
Scala · 964 · apache-2.0
4 months ago
avro, big-data, bioinformatics
data-science-ipython-notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras),
Python · 25938 · other
7 months ago
aws, big-data, caffe
dev-setup
macOS development environment setup: Easy-to-understand instructions with autom
Python · 6014 · other
last year
android-development, aws, bash
clj-kondo
Static analyzer and linter for Clojure code that sparks joy
Clojure · 1619 · epl-1.0
6 months ago
clojure, clojurescript, graalvm
pyspark-cheatsheet
🐍 Quick reference guide to common patterns & functions in PySpark.
301 · mit
last year
cheat, cheatsheet, cheatsheets
Kimera-Semantics
Real-Time 3D Semantic Reconstruction from 2D data
C++ · 610 · bsd-2-clause
5 months ago
3d-reconstruction, cpu, depth-image
lang
List of 126 languages for Laravel Framework, Laravel Jetstream, Laravel Fortify,
PHP · 7262 · mit
4 months ago
language, laravel, laravel-application
ada_language_server
Server implementing the Microsoft Language Protocol for Ada and SPARK
Ada · 205 · gpl-3.0
5 months ago