Commit graph

2943 commits

Author SHA1 Message Date
Aleksei Smirnov 12411fc678
Fix broken inheritance from DataFrameColumn class (#7324) 2024-11-27 10:45:01 -07:00
Xiaoyun Zhang a4c67fe26f
[GenAI] SFT Example (#7316)
* implement sft

* add causalLMDataset

* update

* add SFT trainer

* update

* update

* disable x64 test on non-x64 machine

* support batch
2024-11-25 13:47:15 -08:00
Emmanuel Ferdman 5d0dafbf9c
Update dynamic loading report reference (#7321)
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
2024-11-20 10:29:19 -07:00
Aleksei Smirnov cfc1fb5df2
Update System.Numerics.Tensors version (#7322)
* Update System.Numerics.Tensors version

* Fix build
2024-11-20 10:28:50 -07:00
Eric StJohn d4bc05db15
Update version for 5.0 (#7311) 2024-11-18 12:36:03 -08:00
Michael Sharp 509032755b
Fixing tokenizers version (#7309)
* fixing tokenizers version

* fixes
2024-11-13 16:16:30 -07:00
Michael Sharp 442d51cffb
mp updates (#7304) 2024-11-12 17:39:11 -07:00
Michael Sharp 26b9d65da9
4.0 release notes (#7302) 2024-11-12 13:19:13 -07:00
Carlos Sánchez López 4a6ce50e80
Update dependencies from maintenance-packages to latest versions (#7301)
* Update dependencies from maintenance-packages to latest preview

* Fix sni.dll failure
2024-11-11 22:23:47 -07:00
Michael Sharp e968d3226c
Fixing native lookup (#7282)
* enable m1 tests and fixing native lookup

* Mac testing

* mac test

* macos test

* mac testing

* reverting mac changes

* testing mac temp fix

* fixed cmake

* print DYLD libraries

* osx lightgbm test

* lightgbm dyld log

* excluding osx x64 from some tests
2024-11-11 15:47:59 -07:00
Michael Sharp 3e2ba817bd
Updated remote executor (#7295)
* updated remote executor

* fixing test reference issue

* full version string

* old memory version

* removed nowarn
2024-11-11 14:25:02 -07:00
Tarek Mahmoud Sayed 86112114fd
Final tokenizer's cleanup (#7291) 2024-11-08 10:37:27 -08:00
Xiaoyun Zhang 3659a485a0
[GenAI] Introduce CausalLMPipelineChatClient for MEAI.IChatClient (#7270)
* leverage MEAI abstraction

* Update src/Microsoft.ML.GenAI.LLaMA/Llama3CausalLMChatClient.cs

Co-authored-by: Stephen Toub <stoub@microsoft.com>

* Update src/Microsoft.ML.GenAI.LLaMA/Llama3CausalLMChatClient.cs

Co-authored-by: Stephen Toub <stoub@microsoft.com>

* Update src/Microsoft.ML.GenAI.Phi/Phi3/Phi3CausalLMChatClient.cs

Co-authored-by: Stephen Toub <stoub@microsoft.com>

* fix comments

* Update Microsoft.ML.GenAI.Core.csproj

---------

Co-authored-by: Stephen Toub <stoub@microsoft.com>
2024-11-05 09:27:05 -08:00
Michael Sharp 5b4981a134
Update To MacOS 13 (#7285)
* mac 13 testing

* build fixes
2024-11-04 17:41:25 -07:00
Tarek Mahmoud Sayed 5c503195e8
Add Timeout to Regex used in the tokenizers (#7284)
* Add Timeout to Regex used in the tokenizers

* Address the feedback
2024-11-04 08:06:54 -08:00
Tarek Mahmoud Sayed 7cce7535b7
Add the components governance file `cgmanifest.json` for tokenizer's vocab files (#7283)
* Add the governance file cgmanifest.json for tokenizer's vocab files

* Address the feedback

* apply more schema requirements on the doc
2024-11-01 15:20:59 -07:00
Tarek Mahmoud Sayed a9b4212eb3
Address the feedback regarding Bert tokenizer (#7280)
* Address the feedback regarding Bert tokenizer

* Small fix
2024-10-26 16:07:20 -07:00
Michael Sharp a7a6d88b05
Adds in a way to add settings for the MLContext. (#7273)
* api, no tests

* updates from pr

* fixed rebase errors and pr comments

* updates based on ONNX team
2024-10-22 16:27:04 -06:00
Michael Sharp 869dc9fcfa
fixing osx ci (#7279) 2024-10-22 14:46:20 -06:00
Tarek Mahmoud Sayed 81122c4c48
Introducing WordPiece and Bert tokenizers (#7275)
* Introducing WordPiece and Bert tokenizers

* Fix corner case in WordPiece
2024-10-22 12:33:58 -07:00
Michael Sharp 32bac5e395
fixing apple silicon official build (#7278) 2024-10-22 10:55:12 -06:00
Eui Jung f385b06aa0
Fixes #7271 AOT for ML.Tokenizers (#7272)
* AOT for ML.Tokenizers

* Forgot to add ModelSourceGenerationContext

* Update src/Microsoft.ML.Tokenizers/Model/ModelSourceGenerationContext.cs

Co-authored-by: Eirik Tsarpalis <eirik.tsarpalis@gmail.com>

---------

Co-authored-by: Eirik Tsarpalis <eirik.tsarpalis@gmail.com>
2024-10-17 14:16:22 -07:00
Tarek Mahmoud Sayed 823fc175f4
Misc Changes (#7264)
* Add o1 model support

* Replace Usage of tuples with Range in EncodedToken and Remove TorchSharp Range/Index implementation

* Rename SentencePieceBpeTokenizer to allow adding more models to it in the future.

* Make Tokenizer.Decode returns non-nullable string

* Make BPE tokenizer support added tokens

* add net9 package source to the nuget.config file

* Rename TiktokenPreTokenizer to RegexPreTokenizer
2024-10-11 16:06:22 -07:00
Michael Sharp e7943427f4
Load onnx model from Stream working (#7254) 2024-10-09 11:07:20 -06:00
Aleksei Smirnov 1843bcdfa6
Fix dataframe incorrectly parse CSV when renameDuplicatedColumns is true (#7242)
* Fix dataframe incorrectly parse CSV when renameDuplicatedColumns is true

* Update System.Numerics.Tensors dependency
2024-10-09 09:46:43 -06:00
Eric StJohn 9baf26b58d
Update wording (#7253) 2024-10-07 14:33:42 -07:00
Weihan Li f7d4184f27
docs: update nuget package badge (#7236) 2024-10-07 13:41:06 -06:00
dotnet-maestro[bot] 4919b43668
[main] Update dependencies from dotnet/arcade (#7235)
* Update dependencies from https://github.com/dotnet/arcade build 20240909.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 10.0.0-beta.24430.1 -> To Version 10.0.0-beta.24459.1

* Update dependencies from https://github.com/dotnet/arcade build 20240917.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 10.0.0-beta.24459.1 -> To Version 10.0.0-beta.24467.1

* Update dependencies from https://github.com/dotnet/arcade build 20240926.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 10.0.0-beta.24467.1 -> To Version 10.0.0-beta.24476.2

* Update dependencies from https://github.com/dotnet/arcade build 20241004.4

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 10.0.0-beta.24476.2 -> To Version 10.0.0-beta.24504.4

---------

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
2024-10-07 13:40:50 -06:00
Stephen Toub 3f042f632b
Update tiktoken regexes (#7255) 2024-10-07 13:39:15 -06:00
Xiaoyun Zhang 62c4a13f4d
[GenAI] Support Llama 3.2 1B and 3B model (#7245)
* update

* enable llama3_2

* fix tests
2024-10-06 00:16:36 -07:00
Tarek Mahmoud Sayed 1e914273d3
Move the Tokenizer's data into separate packages. (#7248)
* Move the Tokenizer's data into separate packages.

* Address the feedback

* More feedback addressing

* More feedback addressing

* Trimming/AoT support

* Make data types internal
2024-10-04 14:47:37 -07:00
Eric StJohn 189ba24641
Enable SDL tools (#7247) 2024-10-01 10:38:15 -07:00
Eric StJohn eba09d977c
Add Service Tree ID for .NET Libraries (#7252) 2024-10-01 10:37:34 -07:00
Xiaoyun Zhang be1e428d41
[GenAI] pack GenAI core package (#7246)
* update

* enable llama3_2

* fix tests

* pack GenAI core
2024-09-27 13:10:30 -07:00
Xiaoyun Zhang 817a77f89a
[GenAI] Add Mistral 7B Instruction V0.3 (#7231)
* add mistral and tests

* add test and sample

* add tool call support

* update autogen to v 0.1.0

* update autogen to 0.1.0

* remove tests on non-x64 machien

* add file header

* update

* update

* update ml tokenizer test version

* fix build error

* remove .receive.txt

* Update docs/samples/Microsoft.ML.GenAI.Samples/Mistral/Mistral_7B_Instruct.cs

Co-authored-by: Weihan Li <7604648+WeihanLi@users.noreply.github.com>

* update

* set t to 0

* fix test

* Update Microsoft.ML.GenAI.Mistral.csproj

---------

Co-authored-by: Weihan Li <7604648+WeihanLi@users.noreply.github.com>
2024-09-25 20:15:59 -07:00
Xiaoyun Zhang e14f22fe0d
set <IsPackable>true</IsPackable> to GenAI.Llama and Phi (#7237) 2024-09-16 11:53:03 -07:00
Tarek Mahmoud Sayed 87a41fae4a
Fix decoding special tokens in SentencePiece tokenizer (#7233)
* Fix decoding special tokens in SentencePiece tokenizer

* Apply the change to the other Decode method overload
2024-09-09 15:02:38 -07:00
dotnet-maestro[bot] 4e364e4bda
[main] Update dependencies from dotnet/arcade (#7218)
* Update dependencies from https://github.com/dotnet/arcade build 20240808.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 9.0.0-beta.24368.9 -> To Version 9.0.0-beta.24408.2

* Update dependencies from https://github.com/dotnet/arcade build 20240816.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 9.0.0-beta.24408.2 -> To Version 9.0.0-beta.24416.2

* Update dependencies from https://github.com/dotnet/arcade build 20240821.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 9.0.0-beta.24416.2 -> To Version 9.0.0-beta.24421.2

* Update dependencies from https://github.com/dotnet/arcade build 20240823.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 9.0.0-beta.24421.2 -> To Version 9.0.0-beta.24423.2

* Update dependencies from https://github.com/dotnet/arcade build 20240830.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 9.0.0-beta.24423.2 -> To Version 10.0.0-beta.24430.1

---------

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
2024-09-02 15:03:10 -06:00
Xiaoyun Zhang 7c937bf81a
[GenAI] Add generateEmbedding API to CausalLMPipeline (#7227)
* add embedding

* add frompretrain api to phi3 model

* fix bug

* Update CausalLMPipeline.cs
2024-08-30 10:30:46 -07:00
Xiaoyun Zhang 1d1cc997d5
Directly refer sql data client 4.8.6 package in GenAI tests to fix security vulnerable package (#7228)
* update approvaltest to 7.0

* Update Versions.props

* add direct reference to sql client in GenAI test packages
2024-08-30 10:30:32 -07:00
MohammadReza SH 1612edc770
Refactor Namespace and Seald Classes in Microsoft.ML.AutoML.SourceGenerator Project (#7223)
* Make all classes in Microsoft.ML.AutoML.SourceGenerator project `sealed`

* Refactor namespace from Microsoft.ML.ModelBuilder.SweepableEstimator.CodeGenerator to Microsoft.ML.AutoML.SourceGenerator
2024-08-29 14:09:44 -07:00
Xiaoyun Zhang 70e5ab1e27
[GenAI] Add LLaMA support (#7220)
* add llama

* add test for tokenizer

* make llama 3.1 working

* update

* add shape test for 70b and 405b

* clean up

* add tests

* update

* fix error

* calculate rotary embedding in model layer

* remove rotary_emb from attention

* update feed

* update .csproj

* Update NuGet.config

* fix test

* pass device

* fix test

* update constructor

* disable 405b test

* update

* disable 70b test

* use windows only fact

* revert change

* rename test to LLaMA3_1
2024-08-28 15:35:31 -07:00
Michael Sharp fa8c822045
Update dependency versions. (#7216)
* version updates and fixes

* fixed torch sharp test nuget

* skipping test again

* fixing typo

* consolidating extensions versioning

* fixed remote executor app
2024-08-07 18:12:44 -06:00
Xiaoyun Zhang 5f5a199a5a
[GenAI] Add readme to Microsoft.ML.GenAI.Phi (#7206)
* add readme

* Update src/Microsoft.ML.GenAI.Phi/README.md

Co-authored-by: Luis Quintanilla <46974588+luisquintanilla@users.noreply.github.com>

* Update src/Microsoft.ML.GenAI.Phi/README.md

Co-authored-by: Luis Quintanilla <46974588+luisquintanilla@users.noreply.github.com>

* Update src/Microsoft.ML.GenAI.Phi/README.md

Co-authored-by: Luis Quintanilla <46974588+luisquintanilla@users.noreply.github.com>

* Update src/Microsoft.ML.GenAI.Phi/README.md

Co-authored-by: Luis Quintanilla <46974588+luisquintanilla@users.noreply.github.com>

* Update src/Microsoft.ML.GenAI.Phi/README.md

Co-authored-by: Luis Quintanilla <46974588+luisquintanilla@users.noreply.github.com>

* Update src/Microsoft.ML.GenAI.Phi/README.md

Co-authored-by: Luis Quintanilla <46974588+luisquintanilla@users.noreply.github.com>

---------

Co-authored-by: Luis Quintanilla <46974588+luisquintanilla@users.noreply.github.com>
2024-08-05 13:36:00 -07:00
Xiaoyun Zhang 9ffb3a3175
increase cancelling time (#7207) 2024-08-02 12:44:00 -06:00
Xiaoyun Zhang 6d1f7e2bf8
Add Microsoft.ML.GenAI.Phi, test package and sample project. (#7184)
* add genai.phi and tests

* formatter

* refactor Phi3Tokenizer

* update

* add configuration for phi-series

* add semantic kernel and autogen intergration

* update

* add Microsoft.ML.GenAI.Sample

* use tokenzier model from testTokenizer package

* use defaults

* add quantize linear

* use version string

* remove special token from CreatePhi2 API

* set up quantize sample

* initialize linear with zeros

* update sample

* add 6.0 to targetframework

* fix tests

* update

* remove Phi3Tokenizer and use LlamaTokenizer instead

* revert change in tokenizer package

* run test on x64

* fix tests

* check in approved file

* run test in net6.0

* use meta device

* copy approval tests to output folder

* set up approval test file location

* fix comment

* rename to AddGenAITextGeneration and AddGenAIChatCompletion

* Update job-template.yml

* add mit license

* add reference

* bump code coverage version

* add <PreserveCompilationContext>true</PreserveCompilationContext>

* add runtime package

* remove flag

* add flag

* fix build error

* update

* update
2024-07-30 15:47:31 -07:00
Tarek Mahmoud Sayed f72c9d22fc
Reduce Tiktoken Creation Memory Allocation (#7202) 2024-07-28 15:55:03 -07:00
dotnet-maestro[bot] 34eb579d4b
[main] Update dependencies from dotnet/arcade (#7165)
* Update dependencies from https://github.com/dotnet/arcade build 20240531.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24281.1

* Update dependencies from https://github.com/dotnet/arcade build 20240606.4

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24306.4

* Update dependencies from https://github.com/dotnet/arcade build 20240614.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24314.1

* Update dependencies from https://github.com/dotnet/arcade build 20240621.4

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24321.4

* Update dependencies from https://github.com/dotnet/arcade build 20240627.1

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24327.1

* Update dependencies from https://github.com/dotnet/arcade build 20240702.2

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24352.2

* Update dependencies from https://github.com/dotnet/arcade build 20240710.4

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24360.4

* Update dependencies from https://github.com/dotnet/arcade build 20240718.9

Microsoft.DotNet.Arcade.Sdk , Microsoft.DotNet.Build.Tasks.Feed , Microsoft.DotNet.Helix.Sdk , Microsoft.DotNet.SignTool , Microsoft.DotNet.SwaggerGenerator.MSBuild , Microsoft.DotNet.XliffTasks , Microsoft.DotNet.XUnitExtensions
 From Version 9.0.0-beta.24272.5 -> To Version 9.0.0-beta.24368.9

* fixed async void tests

* fix bpe test

* fixed auto ml test

* skipping test

---------

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
Co-authored-by: Michael Sharp <misharp@microsoft.com>
2024-07-27 18:33:59 -06:00
Erik b4a4b10954
create unique temporary directories to prevent permission issues (#7173)
* minor fix to prevent multiple mldotnet directories

* forgot missing increment
2024-07-26 21:04:17 -06:00
Michael Sharp ba94d7abff
Add package readmes (#7200)
* minor formatting issues, use preview 5, and package readmes

* fixing async void test

* test fixes

* added in ReadExactly extension to stream

* fixes from PR comments

* added types to fast tree

* fixed overflow

* fixed package md file
2024-07-26 21:03:35 -06:00