An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
Chungmin Lee 9735b57be5
Data Skipping Index Part 3-2: Rule (#482)
2021-08-31 16:28:01 -07:00
.github Use Issues for Writing Design Proposals & Move Away from Design Proposal PRs (#340) 2021-01-28 16:16:20 -08:00
build Fix sbt-launch-lib.bash (#436) 2021-04-28 19:47:58 -07:00
ci Add windows build pipeline (#470) 2021-06-24 10:16:32 -07:00
dev Reformat files with scalafmt (#473) 2021-06-28 14:15:38 -07:00
docs Documentation: coding style setting with examples (#471) 2021-06-25 10:30:40 -07:00
examples Update version info in docs, etc. after releasing 0.2.0 (#114) 2020-08-05 17:07:36 -07:00
notebooks Add a notebook for Delta Lake support (#447) 2021-05-26 13:17:57 -07:00
project Data Skipping Index Part 3-1: Utils (#491) 2021-08-23 13:10:25 -07:00
python Refactoring for an extensible Index API: Part 3 (#475) 2021-07-05 17:43:47 +09:00
script Python bindings for Hyperspace (#36) 2020-08-05 11:33:02 -07:00
spark2.4 Cross library build for Spark 2/3 (#421) 2021-04-27 23:14:52 -07:00
spark3.0 Cross library build for Spark 2/3 (#421) 2021-04-27 23:14:52 -07:00
spark3.1 Support Spark 3.1.1 (#434) 2021-05-07 13:15:38 -07:00
src Data Skipping Index Part 3-2: Rule (#482) 2021-08-31 16:28:01 -07:00
.gitignore Support Spark 3.1.1 (#434) 2021-05-07 13:15:38 -07:00
CODE_OF_CONDUCT.md Initial CODE_OF_CONDUCT.md commit 2020-05-07 11:16:42 -07:00
CONTRIBUTING.md Update CONTRIBUTING.md for PR submission guide for a major feature (#378) 2021-04-22 16:42:03 -07:00
LICENSE Update LICENSE. 2020-06-15 11:16:40 -07:00
NOTICE.txt Add NOTICE.txt 2020-06-15 11:50:52 -07:00
README.md Documentation: coding style setting with examples (#471) 2021-06-25 10:30:40 -07:00
ROADMAP.md Updates to Readme and Roadmap (#48) 2020-06-24 10:04:48 -07:00
SECURITY.md Initial SECURITY.md commit 2020-05-07 11:16:46 -07:00
azure-pipelines.yml Disable Spark 3.1 on Windows build (#485) 2021-07-28 22:55:16 -07:00
build.sbt Data Skipping Index Part 3-2: Rule (#482) 2021-08-31 16:28:01 -07:00
run-tests.py Fix azure-pipelines.yml to use ubuntu-18.04 (#371) 2021-03-02 19:20:13 -08:00
scalastyle-config.xml Update scalastyle to allow 2021 in the copyright notice (#396) 2021-04-02 08:42:12 -07:00
version.sbt Cross library build for Spark 2/3 (#421) 2021-04-27 23:14:52 -07:00

README.md

Hyperspace

An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.

aka.ms/hyperspace

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

Please review our contribution guide.

Development on Windows

This repository contains symbolic links, which do not work properly on Windows. To build the project on Windows, use the provided Git aliases to replace the symbolic links with junctions.

$ git config --local include.path ../dev/.gitconfig
$ git replace-symlinks # replace symlinks with junctions
$ #git restore-symlinks # use this to restore the symlinks if needed
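
Before running the aliases, it can help to see which tracked paths are symbolic links. The snippet below is a minimal sketch, not part of the repository's tooling: the helper name list_tracked_symlinks is hypothetical. It relies only on the fact that Git records symbolic links with file mode 120000 in its index.

```shell
# Hypothetical helper (not provided by this repository): print every path
# that Git tracks as a symbolic link. Git stores symlinks with mode 120000.
# Usage: list_tracked_symlinks [path-to-repo]   (defaults to the current directory)
list_tracked_symlinks() {
  # `git ls-files -s` prints "<mode> <object> <stage><TAB><path>" per entry,
  # so splitting on the tab keeps paths that contain spaces intact.
  git -C "${1:-.}" ls-files -s | awk -F'\t' '$1 ~ /^120000 / { print $2 }'
}
```

Calling list_tracked_symlinks from the repository root prints one path per line; empty output means the index contains no symbolic links.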

Using IntelliJ

You can use the built-in sbt shell in IntelliJ without any problems. However, the built-in "Build Project" command may not work. To fix the issue, go to Project Structure -> Project Settings -> Modules and follow these steps:

  • Mark src/main/scala and src/main/scala-spark2 as "Sources" and src/test/scala and src/test/scala-spark2 as "Tests" for the spark2_4 module.
  • Mark src/main/scala and src/main/scala-spark3 as "Sources" and src/test/scala and src/test/scala-spark3 as "Tests" for the spark3_0 module.
  • Remove the root and hyperspace-sources modules.
  • An example of Project Structure

Additionally, if the first build fails with an error such as "object BuildInfo is not a member of package com.microsoft.hyperspace", you might have to run sbt buildInfo.

Inspiration and Special Thanks

This project would not have been possible without the outstanding work from the following communities:

  • Apache Spark: Unified Analytics Engine for Big Data, the engine that Hyperspace builds on top of.
  • Delta Lake: Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark™ and big data workloads. Hyperspace draws considerable inspiration from the way the Delta Lake community operates and from its pioneering of several ideas in the context of data lakes (e.g., their novel use of optimistic concurrency).
  • Databricks: Unified analytics platform. Many thanks for all the inspiration they have provided us.
  • .NET for Apache Spark™: Hyperspace offers .NET bindings for developers, thanks to the efforts of this team in collaborating and releasing the bindings just in time.
  • Minimal Mistakes: The awesome theme behind Hyperspace documentation.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

Apache License 2.0, see LICENSE.