Commit graph

116 commits

Author SHA1 Message Date
Chenhui Hu 4e3c7e6b9a added scala notebooks that use data source connector 2019-12-10 15:51:43 +00:00
Markus Cozowicz e9f45faca4 Merge branch 'master' of https://github.com/microsoft/Accumulo 2019-12-06 21:17:07 +01:00
Markus Cozowicz 4b58fa0a56 added missing root properties to pom.xml 2019-12-06 21:16:59 +01:00
Markus Cozowicz 1823b8c62b Publish parent too 2019-12-06 20:49:07 +01:00
Markus Cozowicz e9b03e0788 Updated package names 2019-12-06 15:13:21 +01:00
Markus Cozowicz 3dc2dd3800
Adjusted branding (#35)
* renamed package
changed name

* adjusted parent link

* drop old zipfs

* switch github action to 1.8

* fix CI/CD jar name
2019-12-05 20:42:50 +01:00
Markus Cozowicz 1f4cef2f71 added badges 2019-12-04 22:21:45 +01:00
Markus Cozowicz 9220cfebd7 Added maven central badges 2019-12-04 22:21:18 +01:00
Markus Cozowicz 0f6207da39 update package name 2019-12-04 22:18:33 +01:00
Markus Cozowicz f6eff33da1 Update README.md 2019-12-04 22:16:52 +01:00
Markus Cozowicz 8a4a961d08 skip null keys and values 2019-12-04 14:30:13 +01:00
Markus Cozowicz 099913f114 removed unsupported javadoc tag 2019-12-04 12:54:22 +01:00
Markus Cozowicz c216426581 select jdk 8 for scala tro build 2019-12-04 12:43:33 +01:00
Markus Cozowicz 79b4816dba added scala api doc 2019-12-04 12:20:30 +01:00
Markus Cozowicz eaa6a371b4 added signing, sources and doc 2019-12-04 12:15:26 +01:00
Markus Cozowicz bbfc821554 make publish.sh executable 2019-12-04 11:39:26 +01:00
Markus Cozowicz 18d03e29c1 added publishing scripts 2019-12-04 11:32:10 +01:00
Markus Cozowicz 5a709594c4
Publish to Sonatype (#33)
* public sonatype release

* invoke publishing script from devops

* only overwrite main jar with shaded one for publish
fixed issue #32 (updated package name)

* updated artifact publishing path
2019-12-04 11:14:46 +01:00
Markus Cozowicz 5dee2eb9b2 actually fix rowkey in schema bug (#30) 2019-12-03 18:38:10 +01:00
Markus Cozowicz 0984540538 Update README.md 2019-12-03 15:50:34 +01:00
Markus Cozowicz 55135e34fe
Column prune fix, MLeap caching (#28)
* use local classloader

* try different class loader

* add println

* add hack for zfs in jimfs

* Update MLeapUtil.scala

Try to add classloader...

* forcefully instantiate zip filesystem provider to overcome scala/spark behavior

* imported OpenJDK ZipFileSystem to overcome JDK dependency

* JDK 8 compat

* more JDK 8 compat

* more JDK 8 compat

* switched to packaged zipfs for iterator

* updated copyright notice as per CELA guidance

* don't re-use the zip file system

* added README for zipfs origin

* added debug print

* switch to log4j logging

* implemented mleap dataframe caching

* added log4j logging
added perf metrics

* changed the groupId to com.microsoft.masc

* package rename

* renamed to com.microsoft.accumulo
introduce guava based cache for MLeap model
try to fix column pruning

* fixed datasource schema passing

* fixed issue #27

* re-added accumulo client to datasource jar

* removed accumulo-core from iterator shaded jar

* added MLeap guid to support fast cache lookup

* fix MLeap model caching

* don't add empty splits (resulting in duplicate records)

* fixing #29 (duplicate rowkey column in schema)
2019-12-03 12:09:44 +01:00
Scott Graham b8d3fd53a6 adding prereq info to notebook, change file name (#25) 2019-11-19 10:05:32 -08:00
Markus Cozowicz 29b534e754
Package back ported ZipFileSystem from OpenJDK (#18)
* use local classloader

* try different class loader

* add println

* add hack for zfs in jimfs

* Update MLeapUtil.scala

Try to add classloader...

* forcefully instantiate zip filesystem provider to overcome scala/spark behavior

* imported OpenJDK ZipFileSystem to overcome JDK dependency

* JDK 8 compat

* more JDK 8 compat

* more JDK 8 compat

* switched to packaged zipfs for iterator

* updated copyright notice as per CELA guidance

* don't re-use the zip file system

* added README for zipfs origin

* added debug print

* switch to log4j logging

* implemented mleap dataframe caching

* added log4j logging
added perf metrics

* changed the groupId to com.microsoft.masc

* removed extensive performance logging
sharing MLeap transformer between iterator instances
disable column pruning test
2019-11-19 08:49:56 -08:00
Markus Cozowicz 3f72f93a2b
Merge pull request #23 from arvindshmicrosoft/master
Fix indentation issue
2019-11-19 08:40:51 -08:00
Arvind Shyamsundar ca9e4f71c9 Fix indentation issue 2019-11-18 22:45:34 -08:00
Scott Graham 57a9fa9c63 updating name (#22) 2019-11-19 01:01:33 -05:00
Scott Graham 4c930dd81c updating readme and cleaning up notebook 2019-11-15 16:30:34 -05:00
Scott Graham c461cf1d99 testing maven github action 2019-11-14 10:17:32 -05:00
Markus Cozowicz bb3391b14b
Merge pull request #17 from microsoft/marcozo/jimfsshading
exclude jimfs from shading (suspect reflection)
2019-11-07 14:56:00 -05:00
Markus Cozowicz 39007de1f1 exclude jimfs from shading (suspect reflection) 2019-11-07 14:53:31 -05:00
Markus Cozowicz 5c13f6c2ef Add BatchScanner.fetchColumnFamily and use in-memory filesystem for MLeap (#15)
* initial fetchColumnFamily implementation

* added validation and removed local file access

* dropped commons dependency
re-added parent

* removed unused import

* trigger build on PR
2019-11-05 21:54:56 -05:00
Scott Graham 1e84fef842 Adding links to trademarks in readme 2019-11-04 09:44:19 -05:00
Markus Cozowicz 5a886cb495
Merge pull request #14 from microsoft/gramhagen/readme
readmes
2019-11-01 15:39:18 -04:00
Scott Graham 2daa4628fd updating readme 2019-11-01 15:35:53 -04:00
Scott Graham c799b5ad5a adding / updating readmes 2019-11-01 15:30:24 -04:00
Markus Cozowicz a75afdb1d4
Merge pull request #13 from microsoft/gramhagen/connector
Focus contributions on spark
2019-10-29 13:58:17 -04:00
Scott Graham abab66369d updating pipelines yaml 2019-10-29 13:38:40 -04:00
Scott Graham 9f33fffef4 moving spark to connector, removing duplication iterator, moving main license file to apache 2019-10-29 09:52:05 -04:00
Scott Graham 138c129c08 updating status badge to point to new devops project 2019-10-17 15:15:02 -04:00
Markus Cozowicz f2c6c6e275 improve JDK 1.9 compat 2019-10-14 20:05:49 +02:00
Markus Cozowicz c94839aec4 replace Map.of() with new HashMap()... to be JDK 1.8 compatible 2019-10-14 19:50:27 +02:00
Markus Cozowicz 8eb6468944
Merge pull request #12 from microsoft/marcozo/dupv2
added duplication iterator
2019-10-11 17:10:05 +02:00
Markus Cozowicz 64842fc83f added duplication iterator 2019-10-11 17:01:03 +02:00
Markus Cozowicz 414aadc3b1 publish artifacts 2019-10-10 19:04:47 +02:00
Markus Cozowicz f8fd49a01e upgrade AVRO to hit compliance 2019-10-09 08:09:29 +02:00
Markus Cozowicz e6087cc2c7 switch to GSON to avoid Jackson version conflict and comply with governance 2019-10-08 20:28:29 +02:00
Markus Cozowicz 8f9d49c21d added build badge 2019-10-08 19:35:04 +02:00
Markus Cozowicz c5cf20fe49 Merge branch 'master' of https://github.com/microsoft/Accumulo 2019-10-08 19:29:36 +02:00
Markus Cozowicz 2e2af86830 reverted JSON version 2019-10-08 19:29:00 +02:00
Scott Graham e4e1f4efc3 adding accumulo spark connector notebook example 2019-10-08 13:24:46 -04:00