Commit Graph

1358 Commits

Author SHA1 Message Date
Michael Sharp d6d78fd68b
M1 helix testing (#7033)
* m1 helix testing

* enable m1 testing

* helix queue update

* fixed m1 baselines

* removed continue on error

* removing accidental file update

* removed wrong queue
2024-03-12 22:37:32 -07:00
Eric StJohn 86c11e1145
Update package versions in use by ML.NET tests (#7055)
Will backport this to the 3.0 branch too.

This also updates Newtonsoft which is a product dependency - but only
updates to the latest servicing release which we can do as a servicing
release ourselves.
2024-03-12 14:42:21 -07:00
Tarek Mahmoud Sayed bad82989a0
Adding needed Tokenizer's APIs (#7047)
* Adding needed Tokenizer's APIs

* Address the feedback

* Small update to the newly exposed APIs

* fix comments

* Update the APIs signatures

* More feedback addressing

* Fix the comments
2024-03-06 19:04:15 -08:00
Tarek Mahmoud Sayed 99c620ad96
Add Span support in tokenizer's Model abstraction (#7035)
* Add Span support in tokenizer's Model abstraction

* Address the feedback

* Use stackalloc instead of the ArrayPool
2024-02-29 18:01:13 -08:00
Eric StJohn f22b60aa9a
Packaging cleanup (#6939)
* Packaging cleanup

Originally I was just trying to remove mentions of snupkg, but then
things got a bit carried away. :)

This is trying to remove as much duplication and dead code related to
packaging that I can.

* Apply code review feedback

* Suppress copying indirect references

* Remove unwanted bundled files from AutoML

* Remove leading slash

* Refactor model download

* Correct the packaging path of native symbols

* Rename NoTargets projects from csproj to proj

* Fix build issues around model download and respond to feedback

* Remove NoTargets file extension enforcement

* Rename proj to CSProj, include in SLN

I'd like to ensure all our projects are included in the SLN and don't
rely on separate build steps.

VS prefers *.csproj in the sln so I renamed things back to csproj.

* Respond to PR feedback
2024-02-27 16:05:43 -08:00
Michael Sharp 3855dcafaa
make MlImage tests not block file for read (#7029) 2024-02-27 15:13:02 -07:00
Tarek Mahmoud Sayed d0aa2c2461
Address the feedback on the tokenizer's library (#7024)
* Fix cache when calling EncodeToIds

* Make EnglishRoberta _mergeRanks thread safe

* Delete Trainer

* Remove the setters on the Bpe properties

* Remove Roberta and Tiktoken special casing in the Tokenizer and support the cases in the Model abstraction

* Support text-embedding-3-small/large embedding

* Remove redundant TokenToId abstraction and keep the one with the extra parameters

* Enable creating Tiktoken asynchronously or directly using the tokenizer data

* Add cancellationToken support in CreateAsync APIs

* Rename sequence to text and Tokenize to Encode

* Rename skipSpecialTokens to considerSpecialTokens

* Rename TokenizerResult to EncodingResult

* Make Token publicly immutable

* Change offset tuples from (Index, End) to (Index, Length)

* Rename NormalizedString method's parameters

* Rename Model's methods to start with verb

* Convert  Model.GetVocab() method to a Vocab property

* Some method's parameters and variable renaming

* Remove Vocab and VocabSize from the abstraction

* Cleanup normalization support

* Minor Bpe cleanup

* Resolve rebase change

* Address the feedback
2024-02-26 10:57:18 -08:00
Michael Sharp 0d2fd603a2
Temp fix for the race condition during the tests. (#7021)
* temp fixing the race condition during the tests

* making db tests sequential
2024-02-21 13:14:45 -07:00
Michael Sharp 4d4f7dc1ec
Update OnnxRuntime to 1.16.3 (#6975)
* onnx update

* formatting update

* adding onnx runtime reference to automl tests
2024-02-21 11:58:38 -07:00
Tarek Mahmoud Sayed 4635a862dd
Tokenizer's Interfaces Cleanup (#7001)
* Tokenizer's Interfaces Cleanup

* Address the feedback

* Optimization
2024-02-16 12:48:38 -07:00
Tarek Mahmoud Sayed 6f55525602
Introducing Tiktoken Tokenizer (#6981)
* Introducing Tiktoken Tokenizer

* Address the feedback

* file renaming
2024-02-06 11:13:28 -08:00
Michael Sharp 54fa44fbf8
testing light gbm tests sequentially (#6968) 2024-01-23 10:32:40 -07:00
Michael Sharp b824ed2a09
Torch sharp version updates and test fixes (#6954)
* torch sharp version updates and test fixes

* removed extra test file

* job template updated for test coverage
2024-01-16 20:59:17 -07:00
Michael Sharp 9f4a389891
Split out non concurrent test collections. (#6937)
* Split out non concurrent test collections.

* Fix Core Tests

* removed torchsharp and tensorflow tests when not x64

* fixes from pr comments

* removing unnecessary line from the proj file
2024-01-08 10:00:49 -07:00
Michael Sharp 8896dd2927
Fixes NER to correctly expand/shrink the labels (#6928)
* ner options fix

* Ner fixed.

* Update src/Microsoft.ML.Tokenizers/Model/EnglishRoberta.cs

Co-authored-by: Eric StJohn <ericstj@microsoft.com>

* fixes from PR comments

* fixed build

---------

Co-authored-by: Eric StJohn <ericstj@microsoft.com>
2024-01-05 17:52:17 -07:00
Eric StJohn 373a86467c
Only use semi-colons for NoWarn (#6935) 2024-01-02 18:02:13 -08:00
dotnet-maestro[bot] f625080a07
[main] Update dependencies from dotnet/arcade (#6703)
* Update dependencies from https://github.com/dotnet/arcade build 20230519.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23269.2

* Add dotnet8 nuget feed

* Update dependencies from https://github.com/dotnet/arcade build 20230529.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23279.1

* Update dependencies from https://github.com/dotnet/arcade build 20230602.3

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23302.3

* Update dependencies from https://github.com/dotnet/arcade build 20230609.8

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23309.8

* Update dependencies from https://github.com/dotnet/arcade build 20230616.6

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23316.6

* Update dependencies from https://github.com/dotnet/arcade build 20230622.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23322.2

* Update dependencies from https://github.com/dotnet/arcade build 20230630.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23330.1

* Update dependencies from https://github.com/dotnet/arcade build 20230710.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23360.1

* Update dependencies from https://github.com/dotnet/arcade build 20230714.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23364.2

* Update dependencies from https://github.com/dotnet/arcade build 20230721.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23371.1

* Update dependencies from https://github.com/dotnet/arcade build 20230728.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23378.2

* Update dependencies from https://github.com/dotnet/arcade build 20230804.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23404.2

* Update dependencies from https://github.com/dotnet/arcade build 20230811.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23411.1

* Update dependencies from https://github.com/dotnet/arcade build 20230819.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23419.1

* Update dependencies from https://github.com/dotnet/arcade build 20230825.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23425.2

* Update dependencies from https://github.com/dotnet/arcade build 20230901.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23451.1

* Update dependencies from https://github.com/dotnet/arcade build 20230901.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23451.1

* Update dependencies from https://github.com/dotnet/arcade build 20230913.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23463.1

* Update dependencies from https://github.com/dotnet/arcade build 20230913.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23463.1

* Update dependencies from https://github.com/dotnet/arcade build 20230913.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23463.1

* Update dependencies from https://github.com/dotnet/arcade build 20231008.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 9.0.0-beta.23508.1

* Update dependencies from https://github.com/dotnet/arcade build 20231010.4

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 9.0.0-beta.23510.4

* Update dependencies from https://github.com/dotnet/arcade build 20231018.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 9.0.0-beta.23518.2

* Update dependencies from https://github.com/dotnet/arcade build 20231028.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 9.0.0-beta.23528.2

* Update dependencies from https://github.com/dotnet/arcade build 20231103.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 9.0.0-beta.23553.1

* Update dependencies from https://github.com/dotnet/arcade build 20231110.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 9.0.0-beta.23560.1

* Update dependencies from https://github.com/dotnet/arcade build 20231117.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 9.0.0-beta.23567.1

* Fixed version update breaks.

* Update dependencies from https://github.com/dotnet/arcade build 20231122.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 9.0.0-beta.23572.2

* Update dependencies from https://github.com/dotnet/arcade build 20231201.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 9.0.0-beta.23601.1

* Update dependencies from https://github.com/dotnet/arcade build 20231207.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 9.0.0-beta.23607.2

* Update dependencies from https://github.com/dotnet/arcade build 20231215.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 9.0.0-beta.23615.2

* Update XUnitVersion

* Update MicrosoftMLOnnxRuntimeVersion to 1.16.3

* Rollback OnnxRuntime and suppress warning

* Update to Xunit with fix for https://github.com/xunit/xunit/issues/2821

* Ensure we pull down 8.0 runtime.

* Update Centos docker containers

* Fix packaging step

* Try including stdint.h to fix missing uint8_t on centos

* Update Centos test queue

* Attempt to use runtime centos-stream8-helix container for tests

* Use centos-stream8-mlnet-helix container for testing

* Undo changes to test data

* Make NETFRAMEWORK ifdef versionless

* Switch back to centos7 for testing

* Revert "Switch back to centos7 for testing"

This reverts commit ab0d41e4b7.

* Update dependencies from https://github.com/dotnet/arcade build 20231221.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 9.0.0-beta.23621.2

* Update dependencies from https://github.com/dotnet/arcade build 20231228.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 9.0.0-beta.23628.1

---------

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
Co-authored-by: Eric StJohn <ericstj@microsoft.com>
Co-authored-by: Michael Sharp <misharp@microsoft.com>
2024-01-02 15:46:00 -08:00
Eric StJohn d3c31274ce
Make double assertions compare with tolerance instead of precision (#6923)
Precision might cause small differences to round to a different number.
Instead compare with a tolerance which is not sensitive to rounding.
2023-12-22 12:18:29 -08:00
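The distinction described in the commit above can be illustrated with a minimal sketch. This is not the ML.NET test code; the helper names `equal_by_precision` and `equal_by_tolerance` and the sample values are hypothetical, chosen only to show how two numbers that differ by a tiny amount can still round to different values.

```python
import math


def equal_by_precision(a: float, b: float, digits: int) -> bool:
    # Hypothetical helper: round both values to `digits` decimals, then compare.
    return round(a, digits) == round(b, digits)


def equal_by_tolerance(a: float, b: float, tolerance: float) -> bool:
    # Hypothetical helper: compare the absolute difference against a tolerance.
    return math.isclose(a, b, rel_tol=0.0, abs_tol=tolerance)


# Two values that differ by only 2e-6 but sit on opposite sides of 0.125,
# which is a rounding boundary at two decimal places.
expected = 0.125001
actual = 0.124999

by_precision = equal_by_precision(expected, actual, 2)
by_tolerance = equal_by_tolerance(expected, actual, 1e-5)

print("precision check passes:", by_precision)
print("tolerance check passes:", by_tolerance)
```

The precision-based check fails here even though the values are nearly identical, while the tolerance-based check depends only on the size of the difference.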
Eric StJohn a60be5f215
Rename NameEntity to NamedEntity (#6917) 2023-12-21 10:08:50 -08:00
Aleksei Smirnov efab0114ed
Reorganize dataframe files (#6872)
* Increase performance of elementwise comparison operations

* Fix Perf Test

* Reorganize files in the DataFrame related projects

* Fix merge issues
2023-12-12 10:04:47 -07:00
Michael Sharp db08da61b8
added in win-arm64 (#6813)
* added in win-arm64

* fixed MKL arm64 cmake issue

* right helix queue

* added in win-arm64

* fixed MKL arm64 cmake issue

* right helix queue

* fixing arm tests

* makes x64 test detection better

* change test label

* fixed onnx files not being included

* added new win-arm baselines

* baseline changes

* fixed build issue

* fixed test

* one more baseline

* fixed pack for mkl redist

* .NET update
2023-11-20 22:05:11 -07:00
Michael Sharp aeb1ab8f6b
Updates LightGBM from 2.X.X to 3.X.X (#6880)
* updated lightGBM version

* finished LightGBM baseline update

* removed accidental enums

* Fixed LightGBM Test

* fixed lightgbm

* fixed version

* fixed test with LightGBM

* fixed test result
2023-11-16 16:42:38 -07:00
Michael Sharp d2cf997d90
Changes some of the CPU Math implementation from our current version to use the new TensorPrimitives package. (#6875)
* using tensor primitives

* added missing files

* some with indexes changed

* Initial swap for TensorPrimitives done

* Rebased and cleaned code

* more minor cleanup

* added system.numerics.tensors version to props

* build fixes

* added net6 again

* updates from PR comments

* fixed sumabsu

* fixed baseline tests

* test fixes

* fixed test failure for kmeans

* changed decimal comparison

* updated more baselines

* Test fixes.

* template update

* Test Fixes.

* fixed performance test csproj

* added baselines for linux arm/64

* fixed linux arm baselines

* fixed arm baselines

* removed extra files

* arm32 baselines updated

* fixed arm baselines
2023-11-14 22:46:15 -07:00
Aleksei Smirnov 796cb354fb
Improve performance of DataFrame binary comparison operations (#6869)
* Increase performance of elementwise comparison operations

* Fix Perf Test

* Fix code review findings
2023-10-22 22:23:12 -07:00
Aleksei Smirnov f7b8d560cd
Implement vectorized binary arithmetic operations (#6854)
* Remove separate enums for scalar operations

* Align implementation with .Net 8.0 Tensor  API

* Fix Modulo

* Implement vectorized Arithmetic binary operations

* Implement vectorized Arithmetics binary scalar operations
2023-10-16 14:59:49 -07:00
Aleksei Smirnov 9c183fc35b
Fix Saving csv with VBufferDataFrameColumn (#6860) 2023-10-13 22:23:43 -07:00
Aleksei Smirnov 64d7ebd093
Fixes incorrect work of DataFrame with VBufferColumn when number of e… (#6851)
* Fixes incorrect work of DataFrame with VBufferColumn when number of elements is greater than Int.MaxValue

* Fix calculation of max capacity and amount of required buffers

* Fix unit test

* Run test allocating more than 2 Gb of memory on 64bit env only

* Fix StringDataFrameColumn same way as VBufferDataFrameColumn

* Fix wrong amount of buffers created in constructor of StringDataFrameColumn

* Fix code review findings
2023-10-04 11:04:47 -07:00
Aleksei Smirnov 5cf6051db7
Increase performance of arithmetic operations by enhancing calculations on nullable values (#6846)
* Optimize PrimitiveColumnContainer.Clone method

* Avoid unnecessary type conversion during binary operations

* Remove using

* Fix DataFrameBuffer constructor

* remove incorrectly added using

* Make DataFrameBuffer Length field protected

* Add performance tests

* Split Test for AppendMany into 4 different tests

* Block init of null validity buffer instead of setting individual bits

* Add unit tests for PrimitiveDataFrameColumn.Clone

* Fixes #6821

* Fix

* Add extra tests

* Fix

* Fix typo

* Fix Divide_Int16 and Divide_Int32_Int16 benchmarks

* Fix

* Avoid using constructor, that copies memory

* First step of tt refactoring

* Step 2

* Step 3

* Move iteration over buffers outside of the PrimitiveDataFrameColumnArithmetic

* Change PrimitiveDataFrameColumnArithmetic

* Fix typo

* Use RawSpan

* Fix bug with AppendMany values to not empty column

* Restart unit tests

* Add more unit tests

* Add GetBitCount method

* Fix failing unit test

* Implementation

* Change unit tests

* Update unit tests

* Refactoring BinaryOperation

* Intermediate changes

* Intermediate results

* Implement Binary Scalar Reverse Operations

* Add implementation for BinaryIntOperations

* Implement Comparison Operations

* Implement actual calculations for Comparison operations

* Uncomment performance tests

* Remove unintentional code changes

* Add reference to Apache Arrow project license in THIRD-PARTY-NOTICES

* Fix license issues
2023-10-02 17:04:03 -07:00
Aleksei Smirnov 3c625bf542
6847 incorrectly sets column value (#6849)
* Fix DataFrame incorrectly sets column value for index higher than Buffer.MaxCapacity

* Revert renaming
2023-10-02 16:07:10 -07:00
Aleksei Smirnov 7fe293da31
PrimitiveDataFrameColumn.Clone method crashes when is used with IEnumerable mapIndices argument (#6822)
* Split Test for AppendMany into 4 different tests

* Block init of null validity buffer instead of setting individual bits

* Add unit tests for PrimitiveDataFrameColumn.Clone

* Fixes #6821

* Fix

* Fix bug with AppendMany values to not empty column

* Restart unit tests

* Add more unit tests

* Fix failing unit test

* Fix code review findings
2023-09-27 15:28:28 -07:00
Eric StJohn 97926a8c53
Update dependencies (#6837)
* Update dependencies

* Add reference to NuGet.Packaging.Core
2023-09-27 11:16:01 -07:00
Aleksei Smirnov 5648c89bbd
Improve performance of column cloning inside DataFrame arithmetics (#6814)
* Optimize PrimitiveColumnContainer.Clone method

* Avoid unnecessary type conversion during binary operations

* Remove using

* Fix DataFrameBuffer constructor

* remove incorrectly added using

* Make DataFrameBuffer Length field protected

* Fix typo

* Use RawSpan
2023-09-26 18:28:39 -07:00
Aleksei Smirnov 15e6a556ce
Add performance benchmarks for dataframe arithmetic operations (#6827)
* Add performance tests

* Add extra tests

* Fix

* Fix typo

* Fix Divide_Int16 and Divide_Int32_Int16 benchmarks

* Fix

* Change csproj file

* Update BenchmarkDotNetVersion to 0.13.5

* Fix

* Change to 0.13.1 because that is what is latest version in our nuget feeds.

---------

Co-authored-by: Jake Radzikowski <JakeRad@Microsoft.com>
2023-09-26 10:23:39 -07:00
Xiaoyun Zhang a052146947
update interactive kernel version (#6836)
* update interactive kernel version

* update

* Update Microsoft.Data.Analysis.Interactive.Tests.csproj
2023-09-26 01:26:20 -07:00
Raffaello Fraboni 49824f3915
Fix wrong type conversion on PrimitiveDataFrameColumn (#6834)
* Fix wrong type conversion on PrimitiveDataFrameColumn

* Added tests for #6829

* Fix test

* Add file generated from tt template and fix unit tests

---------

Co-authored-by: Aleksei Smirnov <tlalok@inbox.ru>
2023-09-25 10:52:21 -07:00
Aleksei Smirnov d6927515d8
Append dataframe rows based on column names (#6808)
* Append dataframe rows based on column names

* Update DataFrame.cs

---------

Co-authored-by: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
2023-09-01 22:06:55 -06:00
Aleksei Smirnov d9dbf99d97
Allow to define CultureInfo for parsing values on reading DataFrame from csv (#6782)
* Use CultureInfo for parsing values in csv file

* Fix merge issues
2023-08-31 21:36:46 -06:00
Aleksei Smirnov e6a88c440b
Fix inconsistent null handling in DataFrame Arithmetics (#6770)
* Fix inconsistent null handling in DataFrame Arithmetics

* Fix Null Count and division by zero issues

* Minor changes to restart build and rerun flaky tests
2023-08-31 10:27:49 -06:00
Aleksei Smirnov e3f53a4497
Fix DataFrame.LoadCsv can not load CSV with duplicate column names (#6772) 2023-08-30 20:38:24 -06:00
Aleksei Smirnov 39235a76d9
Fix issue with addIndexColumn in DataFrame.LoadCsv (#6769)
* Fix issue with addIndexColumn in DataFrame.LoadCsv

* Fix tests
2023-08-25 15:59:41 -06:00
zewditu Hailemariam 179f7dc781
Add TargetType to Type_convert (#6785)
* Add target Type in convert  type

* Add custom type "DataKind"

* clean

* Add DataKind name space

* clean test
2023-08-25 10:57:35 -07:00
Lehonti Ramos 077a6b8196
Modernized some argument checks that still used string literals for parameter names (#6766)
Co-authored-by: John Doe <john@doe>
2023-08-06 19:49:12 -07:00
Michael Sharp 8952994c67
fixed mac build and minor torch sharp changes (#6776) 2023-07-27 21:46:14 -06:00
Michael Sharp 65c7ca9d9a
Add NameEntityRecognition and Q&A deep learning tasks. (#6760)
* NER

* QA almost done, runtime error

* QA finished

* fixes from PR comments

* fixed build

* build fixes

* perf changes

* made disposable

* fixed not disposing model

* added some disposables to TensorFlow for memory

* build testing

* fixing build

* added missing dispose

* build fixes

* build fixes

* testing macos fix
2023-07-24 13:47:24 -06:00
Aleksei Smirnov 69cc4bcb76
Fix incorrect DataFrame min max computation with NULL (#6734)
* Step 1

* Step 2

* Fixed code review findings
2023-07-07 15:22:02 -07:00
Aleksei Smirnov 578d7bcb67
Provide ability to filter dataframe column by null via ElementWise Methods (#6723)
* Provide ability to filter by null value

* Add comments

* Fix code review findings
2023-07-07 08:58:42 -07:00
Aleksei Smirnov caee3c2e2d
Reduce coupling of Data.Analysis.Tests project (#6759) 2023-07-07 02:41:49 -07:00
Aleksei Smirnov d9e1ee1e27
Run tests that requires more than 2 Gb of Memory only on 64-bit env (#6758) 2023-07-07 02:39:47 -07:00
Aleksei Smirnov 26c24463a8
Fix DataFrame to allow to store columns with size more than 2 Gb (#6710)
* Fix error with allocating more than MaxCapacity of Byte Memory Buffer

* Remove Unit test as it consumes too much memory

* Fix issue with increasing buffer capacity over limit when double it size
2023-07-06 12:33:39 -07:00
Aleksei Smirnov 53c0f269d6
Fix the behavior of column SetName method (#6676)
* Fix the behavior of column SetName method

* Fix stack overflow exception

* Fix merge issues

---------

Co-authored-by: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
2023-07-06 12:17:20 -07:00