Commit Graph

2798 Commits

Author SHA1 Message Date
Michael Sharp d808e4ce16 added in release notes for 3.0.1 2024-01-18 18:01:12 +00:00
github-actions[bot] f7a3b611cb
[release/3.0] Torch sharp version updates and test fixes (#6958)
* torch sharp version updates and test fixes

* removed extra test file

* job template updated for test coverage

---------

Co-authored-by: Michael Sharp <misharp@microsoft.com>
2024-01-17 10:57:39 -08:00
github-actions[bot] ec498d87a2
[release/3.0] fix #6949 (#6953)
* fix #6949

* add MoveForwardToAttribute to assembly.cs and fix search space sub namespace

---------

Co-authored-by: XiaoYun Zhang <xiaoyuz@microsoft.com>
2024-01-12 21:12:00 -07:00
Michael Sharp a5b5c7a82a
[release/3.0] Split out non concurrent test collections. (#6950)
* Split out non concurrent test collections.

* Fix Core Tests

* removed torchsharp and tensorflow tests when not x64

* fixes from pr comments

* removing unnecessary line from the proj file
2024-01-10 10:10:10 -07:00
dotnet-maestro[bot] f602581005
[release/3.0] Update dependencies from dotnet/arcade (#6938)
* Update dependencies from https://github.com/dotnet/arcade build 20231220.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XUnitExtensions
 From Version 8.0.0-beta.23265.1 -> To Version 8.0.0-beta.23620.2

* Fixed version update breaks.

* Update XUnitVersion

* Update MicrosoftMLOnnxRuntimeVersion to 1.16.3

* Rollback OnnxRuntime and suppress warning

* Update to Xunit with fix for https://github.com/xunit/xunit/issues/2821

* Update Centos docker containers

* Fix packaging step

* Try including stdint.h to fix missing uint8_t on centos

* Update Centos test queue

* Attempt to use runtime centos-stream8-helix container for tests

* Use centos-stream8-mlnet-helix container for testing

* Undo changes to test data

* Make NETFRAMEWORK ifdef versionless

* Only use semi-colons for NoWarn

* Fix assert by only accessing idx (#6924)

Asserting on `_rowCount <  Utils.Size(_valueBoundaries)` was catching a
case where `_rowCount`'s update was reordered before `_valueBoundaries`

This was unnecessary, since this method doesn't need to use `_rowCount`.

Instead, make the asserts use only `idx` which will be maintained
consistent with the waiter logic in this cache.
Ensure we only ever use `_rowCount` from the caching thread, so write
reordering won't matter.

* Don't include the SDK in our helix payload (#6918)

* Don't include the SDK in our helix payload

I noticed that the tests included the latest SDK - including the host -
in our helix payloads.

This is a large amount of unnecessary downloads and it also makes it so
we use the latest host on the older frameworks which can fail when the
latest host drops support for distros.

Since our tests shouldn't need the full CLI, remove this from our helix
payloads.

We'll instead get just the runtime we need through `AdditionalDotNetPackage`

* Place Helix downloaded runtime on the PATH

Helix only sets the path when the CLI is included, however we don't
need the CLI.

* Make double assertions compare with tolerance instead of precision (#6923)

Precision might cause small differences to round to a different number.
Instead compare with a tolerance which is not sensitive to rounding.
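
The difference between the two comparison styles can be sketched as follows. This is a minimal illustration in Python, not the ML.NET test code: the helper names `equal_with_precision` and `equal_with_tolerance` and the sample values are hypothetical, chosen only to show how rounding to a fixed number of digits can split two nearly identical values.

```python
def equal_with_precision(expected: float, actual: float, digits: int) -> bool:
    """Round both values to `digits` decimal places, then compare exactly."""
    return round(expected, digits) == round(actual, digits)


def equal_with_tolerance(expected: float, actual: float, tolerance: float) -> bool:
    """Compare the absolute difference against an explicit tolerance."""
    return abs(expected - actual) <= tolerance


# Two values that differ by only 2e-7 but sit on opposite sides of a
# rounding boundary at two decimal places.
a, b = 0.1249999, 0.1250001

print(equal_with_precision(a, b, 2))     # False: rounds to 0.12 vs 0.13
print(equal_with_tolerance(a, b, 0.01))  # True: difference is far below 0.01
```

The precision check fails even though the values are almost equal, while the tolerance check depends only on the size of the difference.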

---------

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
Co-authored-by: Michael Sharp <misharp@microsoft.com>
Co-authored-by: Eric StJohn <ericstj@microsoft.com>
2024-01-09 10:08:04 -07:00
github-actions[bot] 4f268c7e52
[release/3.0] Fixes NER to correctly expand/shrink the labels (#6946)
* ner options fix

* Ner fixed.

* Update src/Microsoft.ML.Tokenizers/Model/EnglishRoberta.cs

Co-authored-by: Eric StJohn <ericstj@microsoft.com>

* fixes from PR comments

* fixed build

---------

Co-authored-by: Michael Sharp <misharp@microsoft.com>
Co-authored-by: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
Co-authored-by: Eric StJohn <ericstj@microsoft.com>
2024-01-08 17:45:32 -07:00
github-actions[bot] 66c06e63c1
Rename NameEntity to NamedEntity (#6945)
Co-authored-by: Eric StJohn <ericstj@microsoft.com>
2024-01-08 14:40:37 -08:00
Eric StJohn f5554803ac
Branding for 3.0.1 (#6943)
* Branding for 3.0.1

* More version updates for 3.0.1
2024-01-08 11:30:40 -07:00
Michael Sharp d96d7b741e
Release notes for 3.0 (#6888)
* release notes 3.0

* Apply suggestions from code review

Co-authored-by: Jeff Handley <jeffhandley@users.noreply.github.com>

* updates from pr

---------

Co-authored-by: Jeff Handley <jeffhandley@users.noreply.github.com>
2023-11-22 12:35:14 -07:00
Michael Sharp db08da61b8
added in win-arm64 (#6813)
* added in win-arm64

* fixed MKL arm64 cmake issue

* right helix queue

* added in win-arm64

* fixed MKL arm64 cmake issue

* right helix queue

* fixing arm tests

* makes x64 test detection better

* change test label

* fixed onnx files not being included

* added new win-arm baselines

* baseline changes

* fixed build issue

* fixed test

* one more baseline

* fixed pack for mkl redist

* .NET update
2023-11-20 22:05:11 -07:00
Michael Sharp aeb1ab8f6b
Updates LightGBM from 2.X.X to 3.X.X (#6880)
* updated lightGBM version

* finished LightGBM baseline update

* removed accidental enums

* Fixed LightGBM Test

* fixed lightgbm

* fixed version

* fixed test with LightGBM

* fixed test result
2023-11-16 16:42:38 -07:00
Michael Sharp d2cf997d90
Changes some of the CPU Math implementation from our current version to use the new TensorPrimitives package. (#6875)
* using tensor primitives

* added missing files

* some with indexes changed

* Initial swap for TensorPrimitives done

* Rebased and cleaned code

* more minor cleanup

* added system.numerics.tensors version to props

* build fixes

* added net6 again

* updates from PR comments

* fixed sumabsu

* fixed baseline tests

* test fixes

* fixed test failure for kmeans

* changed decimal comparison

* updated more baselines

* Test fixes.

* template update

* Test Fixes.

* fixed performance test csproj

* added baselines for linux arm/64

* fixed linux arm baselines

* fixed arm baselines

* removed extra files

* arm32 baselines updated

* fixed arm baselines
2023-11-14 22:46:15 -07:00
Jeff Handley d8ad1e65b5
FabricBot: Remove area pod project board automation (#6881) 2023-11-14 15:10:26 -07:00
Aleksei Smirnov 796cb354fb
Improve performance of DataFrame binary comparison operations (#6869)
* Increase performance of elementwise comparison operations

* Fix Perf Test

* Fix code review findings
2023-10-22 22:23:12 -07:00
Akash Kundu a3d3813511
Update DataViewRowCursor.md (#6855)
* Update DataViewRowCursor.md

fixed some errors and typos.

* Update docs/code/DataViewRowCursor.md

---------

Co-authored-by: Eric StJohn <ericstj@microsoft.com>
2023-10-19 14:52:58 -07:00
Aleksei Smirnov e82575021e
Avoid Boxing/Unboxing on accessing elements of VBufferDataFrameColumn (fix merge issues) (#6867)
* Avoid Boxing/Unboxing on accessing elements of VBufferDataFrameColumn

* Avoid boxing for vbuffer column
2023-10-17 13:31:50 -07:00
Aleksei Smirnov 766569b86a
Avoid Boxing/Unboxing on accessing elements of VBufferDataFrameColumn (#6865) 2023-10-16 16:32:43 -07:00
Aleksei Smirnov f7b8d560cd
Implement vectorized binary arithmetic operations (#6854)
* Remove separate enums for scalar operations

* Align implementation with .NET 8.0 Tensor API

* Fix Modulo

* Implement vectorized Arithmetic binary operations

* Implement vectorized Arithmetics binary scalar operations
2023-10-16 14:59:49 -07:00
Aleksei Smirnov 9c183fc35b
Fix Saving csv with VBufferDataFrameColumn (#6860) 2023-10-13 22:23:43 -07:00
Diego Colombo e3ec250d51
update .NET Interactive (#6857) 2023-10-11 20:45:31 -04:00
Aleksei Smirnov 64d7ebd093
Fixes incorrect work of DataFrame with VBufferColumn when number of e… (#6851)
* Fixes incorrect work of DataFrame with VBufferColumn when number of elements is greater than Int.MaxValue

* Fix calculation of max capacity and amount of required buffers

* Fix unit test

* Run test allocating more than 2 Gb of memory on 64bit env only

* Fix StringDataFrameColumn same way as VBufferDataFrameColumn

* Fix wrong amount of buffers created in constructor of StringDataFrameColumn

* Fix code review findings
2023-10-04 11:04:47 -07:00
Aleksei Smirnov 5cf6051db7
Increase performance of arithmetic operations by enhancing calculations on nullable values (#6846)
* Optimize PrimitiveColumnContainer.Clone method

* Avoid unnecessary type conversion during binary operations

* Remove using

* Fix DataFrameBuffer constructor

* remove incorrectly added using

* Make DataFrameBuffer Length field protected

* Add performance tests

* Split Test for AppendMany into 4 different tests

* Block init of null validity buffer instead of setting individual bits

* Add unit tests for PrimitiveDataFrameColumn.Clone

* Fixes #6821

* Fix

* Add extra tests

* Fix

* Fix typo

* Fix Divide_Int16 and Divide_Int32_Int16 benchmarks

* Fix

* Avoid using constructor, that copies memory

* First step of tt refactoring

* Step 2

* Step 3

* Move iteration over buffers outside of the PrimitiveDataFrameColumnArithmetic

* Change PrimitiveDataFrameColumnArithmetic

* Fix typo

* Use RawSpan

* Fix bug with AppendMany values to non-empty column

* Restart unit tests

* Add more unit tests

* Add GetBitCount method

* Fix failing unit test

* Implementation

* Change unit tests

* Update unit tests

* Refactoring BinaryOperation

* Intermediate changes

* Intermediate results

* Implement Binary Scalar Reverse Operations

* Add implementation for BinaryIntOperations

* Implement Comparison Operations

* Implement actual calculations for Comparison operations

* Uncomment performance tests

* Remove unintentional code changes

* Add reference to Apache Arrow project license in THIRD-PARTY-NOTICES

* Fix license issues
2023-10-02 17:04:03 -07:00
Aleksei Smirnov 3c625bf542
6847 incorrectly sets column value (#6849)
* Fix DataFrame incorrectly sets column value for index higher than Buffer.MaxCapacity

* Revert renaming
2023-10-02 16:07:10 -07:00
Aleksei Smirnov 7fe293da31
PrimitiveDataFrameColumn.Clone method crashes when is used with IEnumerable mapIndices argument (#6822)
* Split Test for AppendMany into 4 different tests

* Block init of null validity buffer instead of setting individual bits

* Add unit tests for PrimitiveDataFrameColumn.Clone

* Fixes #6821

* Fix

* Fix bug with AppendMany values to non-empty column

* Restart unit tests

* Add more unit tests

* Fix failing unit test

* Fix code review findings
2023-09-27 15:28:28 -07:00
Eric StJohn 97926a8c53
Update dependencies (#6837)
* Update dependencies

* Add reference to NuGet.Packaging.Core
2023-09-27 11:16:01 -07:00
R. G. Esteves 66eed89f6b
Addresses #6533 (#6838)
* Initial structure and started fleshing out some sections

* Some corrections and paragraph on DL usages

* Starting fleshing out DL on ML.NET section

* Addresses #6533
2023-09-27 11:35:51 -06:00
Aleksei Smirnov 85ee6e5117
Simplify tt files for PrimitiveDataFrameColumnAritmetics (#6830)
* First step of tt refactoring

* Step 2

* Step 3
2023-09-26 18:36:11 -07:00
Aleksei Smirnov 5648c89bbd
Improve performance of column cloning inside DataFrame arithmetics (#6814)
* Optimize PrimitiveColumnContainer.Clone method

* Avoid unnecessary type conversion during binary operations

* Remove using

* Fix DataFrameBuffer constructor

* remove incorrectly added using

* Make DataFrameBuffer Length field protected

* Fix typo

* Use RawSpan
2023-09-26 18:28:39 -07:00
Aleksei Smirnov 15e6a556ce
Add performance benchmarks for dataframe arithmetic operations (#6827)
* Add performance tests

* Add extra tests

* Fix

* Fix typo

* Fix Divide_Int16 and Divide_Int32_Int16 benchmarks

* Fix

* Change csproj file

* Update BenchmarkDotNetVersion to 0.13.5

* Fix

* Change to 0.13.1 because that is what is latest version in our nuget feeds.

---------

Co-authored-by: Jake Radzikowski <JakeRad@Microsoft.com>
2023-09-26 10:23:39 -07:00
Xiaoyun Zhang a052146947
update interactive kernel version (#6836)
* update interactive kernel version

* update

* Update Microsoft.Data.Analysis.Interactive.Tests.csproj
2023-09-26 01:26:20 -07:00
Raffaello Fraboni 49824f3915
Fix wrong type conversion on PrimitiveDataFrameColumn (#6834)
* Fix wrong type conversion on PrimitiveDataFrameColumn

* Added tests for #6829

* Fix test

* Add file generated from tt template and fix unit tests

---------

Co-authored-by: Aleksei Smirnov <tlalok@inbox.ru>
2023-09-25 10:52:21 -07:00
Michael Sharp 09b80f8a08
removed codecov token (#6811) 2023-09-08 13:13:01 -06:00
Aleksei Smirnov d6927515d8
Append dataframe rows based on column names (#6808)
* Append dataframe rows based on column names

* Update DataFrame.cs

---------

Co-authored-by: Michael Sharp <51342856+michaelgsharp@users.noreply.github.com>
2023-09-01 22:06:55 -06:00
Aleksei Smirnov d9dbf99d97
Allow to define CultureInfo for parsing values on reading DataFrame from csv (#6782)
* Use CultureInfo for parsing values in csv file

* Fix merge issues
2023-08-31 21:36:46 -06:00
Lehonti Ramos ccf34e370b
File-scoped namespaces in files under `Prediction` (`Microsoft.ML.Core`) (#6792)
Co-authored-by: Lehonti Ramos <john@doe>
2023-08-31 21:36:33 -06:00
Aleksei Smirnov e6a88c440b
Fix inconsistent null handling in DataFrame Arithmetics (#6770)
* Fix inconsistent null handling in DataFrame Arithmetics

* Fix Null Count and division by zero issues

* Minor changes to restart build and rerun flaky tests
2023-08-31 10:27:49 -06:00
Lehonti Ramos aaf226c7e7
File-scoped namespaces in files under `Data` (`Microsoft.ML.Core`) (#6789)
Co-authored-by: Lehonti Ramos <john@doe>
2023-08-30 22:21:45 -06:00
Lehonti Ramos 34389b63e5
File-scoped namespaces in files under `ComponentModel` (`Microsoft.ML.Core`) (#6788)
Co-authored-by: Lehonti Ramos <john@doe>
2023-08-30 22:21:19 -06:00
Aleksei Smirnov e3f53a4497
Fix DataFrame.LoadCsv can not load CSV with duplicate column names (#6772) 2023-08-30 20:38:24 -06:00
Aleksei Smirnov 39235a76d9
Fix issue with addIndexColumn in DataFrame.LoadCsv (#6769)
* Fix issue with addIndexColumn in DataFrame.LoadCsv

* Fix tests
2023-08-25 15:59:41 -06:00
Lehonti Ramos 43a6a81185
File-scoped namespaces in files under `EntryPoints` (`Microsoft.ML.Core`) (#6790)
Co-authored-by: Lehonti Ramos <john@doe>
2023-08-25 15:58:35 -06:00
Lehonti Ramos 92eccadb19
File-scoped namespaces in files under `Environment` (`Microsoft.ML.Core`) (#6791)
Co-authored-by: Lehonti Ramos <john@doe>
2023-08-25 15:58:05 -06:00
zewditu Hailemariam 179f7dc781
Add TargetType to Type_convert (#6785)
* Add target Type in convert type

* Add custom type "DataKind"

* clean

* Add DataKind name space

* clean test
2023-08-25 10:57:35 -07:00
Michael Sharp c28d5af9f4
removed deprecated yosemite brew (#6805) 2023-08-24 12:56:15 -07:00
Lehonti Ramos 077a6b8196
Modernized some argument checks that still used string literals for parameter names (#6766)
Co-authored-by: John Doe <john@doe>
2023-08-06 19:49:12 -07:00
zewditu Hailemariam ea84d429a1
Add QA sweepable estimator in AutoML (#6781)
* Add QA sweepable

* clean
2023-08-03 12:24:11 -07:00
Aleksei Smirnov a823199307
Improve DataFrame Arithmetics implementation (#6763)
* Change methods signature generation

* Change DataFrameColumn Arithmetics

* Change DataFrameColumn Operations

* Fix unit tests

* Fix spaces

* Fix code review findings
2023-07-28 07:58:03 -07:00
Michael Sharp 8952994c67
fixed mac build and minor torch sharp changes (#6776) 2023-07-27 21:46:14 -06:00
Xiaoyun Zhang 7b6af06545
fix issue (#6768) 2023-07-24 16:23:09 -06:00
Michael Sharp 65c7ca9d9a
Add NameEntityRecognition and Q&A deep learning tasks. (#6760)
* NER

* QA almost done, runtime error

* QA finished

* fixes from PR comments

* fixed build

* build fixes

* perf changes

* made disposable

* fixed not disposing model

* added some disposables to TensorFlow for memory

* build testing

* fixing build

* added missing dispose

* build fixes

* build fixes

* testing macos fix
2023-07-24 13:47:24 -06:00