Commit Graph

10 Commits

Author SHA1 Message Date
Samuel Lee 982858166c Merged PR 11150425: Arm64 server perf work
## Description:

+ Improve `SymCryptFdefMontgomeryReduceAsm`
  + Reduce instruction count in the inner loop - remove superfluous `adc` with zero
  + Special case first iteration of the reduction loop to further reduce instruction count and multiplication uops
  + For ease of phrasing, used non-volatile registers in aapcs64 assembly for the first time; this required slightly extending the SymCryptAsm processor script.
+ Improve `SymCryptFdefRawSquareAsm` by tweaking to reduce undue dependencies.

+ More room for improvements in follow-on PR, but checking in what we have to get improvements before GE deadline.
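
As background for the reduction loop discussed above, here is a minimal Python sketch of textbook word-by-word Montgomery reduction. It is not the Arm64 assembly from this PR, which is not shown here; the function name `montgomery_reduce` and the use of Python big integers in place of limb arrays are illustrative assumptions. Each loop iteration clears one low word of the accumulator, which is the step an inner loop of such a routine would implement with multiply and add-with-carry instructions.

```python
def montgomery_reduce(t: int, n: int, word_bits: int = 64) -> int:
    """Return t * R^-1 mod n, where R = 2^(word_bits * k) and k is the limb count of n.

    Assumes n is odd and 0 <= t < n * R (hypothetical helper, for illustration only).
    """
    if n % 2 == 0:
        raise ValueError("modulus must be odd")
    word = 1 << word_bits
    mask = word - 1
    k = max(1, -(-n.bit_length() // word_bits))  # number of limbs in n
    # n_prime = -n^-1 mod 2^word_bits (pow with exponent -1 needs Python 3.8+)
    n_prime = (-pow(n, -1, word)) & mask
    for _ in range(k):
        # Pick m so that the low word of t + m*n is zero, then drop that word.
        m = ((t & mask) * n_prime) & mask
        t = (t + m * n) >> word_bits
    # A single conditional subtraction brings the result into [0, n).
    if t >= n:
        t -= n
    return t
```

Multiplying the result by `R` modulo `n` recovers `t mod n`, which is a convenient way to sanity-check the sketch.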

## Admin Checklist:
- [X] You have updated documentation in symcrypt.h to reflect any changes in behavior
- [X] You have updated CHANGELOG.md to reflect any changes in behavior
- [X] You have updated symcryptunittest to exercise any new functionality
- [X] If you have introduced any symbols in symcrypt.h you have updated production and test dynamic export symbols (exports.ver / exports.def / symcrypt.src) and tested the updated dynamic modules with symcryptunittest
- [X] If you have introduced functionality that varies based on CPU features, you have manually tested with and without relevant features
- [X] If you have made significant changes to a particular algorithm, you have checked that performance numbers reported by symcryptunittest are in line with expectations
- [X] If you have added new algorithms/modes, you have updated the status indicator text for the associated modules if necessary
2024-07-26 02:18:13 +00:00
Mitch Lindgren 🦎 84c69fcda1 Merged PR 11000448: Tidying and small build fixes
- !10935012 added a `.gitattributes` file to try to enforce consistent Windows-style line endings, but this causes a bunch of spurious diffs to show up after checking out the latest branch (ironically, on Windows only). See [this Stack Overflow question](https://stackoverflow.com/questions/5787937/git-status-shows-files-as-changed-even-though-contents-are-the-same) which refers to a similar issue. After fighting with Git for a bit, it seems like the easiest fix is just to remove this file.
- Workaround for Python versions < 3.11 not being able to parse timestamps with the 'Z' suffix indicating UTC time (started breaking our pipeline builds due to a recent Git version update)
- Fix for Python 3.12 complaining about invalid escape characters in `symcryptasm_processor.py` (use raw strings)
- When building OpenSSL, pin to a specific tag if no branch is specified on the command line, so that we're not building against a moving target
2024-06-27 01:03:26 +00:00
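
The two Python fixes in the commit above can be illustrated with a minimal sketch. The actual pipeline code is not shown in this log, so the helper name `parse_utc_timestamp` and the regular expression below are hypothetical. The sketch assumes the timestamps are parsed with `datetime.fromisoformat`, which only accepts a trailing `Z` from Python 3.11 onward.

```python
import re
from datetime import datetime, timezone


def parse_utc_timestamp(text: str) -> datetime:
    """Parse an ISO 8601 timestamp, tolerating a trailing 'Z' on Python < 3.11."""
    # Older versions of datetime.fromisoformat reject the 'Z' suffix,
    # so rewrite it as an explicit UTC offset before parsing.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


# Hypothetical pattern, for illustration only. The r"" prefix keeps backslashes
# literal: without it, "\d" is an invalid string escape (a SyntaxWarning in
# Python 3.12) and "\b" would silently become a backspace character.
REGISTER_PATTERN = re.compile(r"\bQ(\d+)\b")
```

For example, `parse_utc_timestamp("2024-06-27T01:03:26Z")` yields a timezone-aware `datetime` in UTC on both old and new interpreters.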
Changyu Li b6a267815e Merged PR 9576163: Build SymCrypt with gcc-arm-linux-gnueabihf
2023-10-13 19:26:47 +00:00
Samuel Lee 258ce3e751 Merged PR 9190547: Add 256b and 384b specific modular Arm64 SymCryptAsm
+ Add 256b and P384 specific modular Arm64 SymCryptAsm

Related work items: #45077512
2023-07-06 21:04:57 +00:00
Mitch Lindgren 🦎 515bc99971 Merged PR 8235253: Enable OneBranch pipelines
This change rewrites our Azure DevOps pipelines to be compatible with OneBranch pipelines. It also adds new scripts to help with building, testing and packaging SymCrypt. These scripts replicate some of the functionality of `scbuild` but are also compatible with Linux builds. They can be used directly on the command line by developers, but the OneBranch pipeline also uses them to move as much as possible of the "business logic" of building SymCrypt out of the YAML templates and into Python scripts.

Also includes various reorganization and small fixes.
2023-01-12 00:52:49 +00:00
Cagdas Calik b33907817d Merged PR 7581277: Add optimized SHA-2 implementations
Add the following SHA-2 implementations:
- SHA-256 SSSE3+BMI2 intrinsics implementation with 4-way parallel message expansion
- SHA-256 SSSE3+BMI2 assembly implementation with 4-way parallel message expansion
- SHA-256 AVX2+BMI2 intrinsics implementation with 8-way parallel message expansion
- SHA-256 AVX2+BMI2 assembly implementation with 8-way parallel message expansion
- SHA-512 AVX2+BMI2 intrinsics implementation for single-block processing
- SHA-512 AVX2+BMI2 intrinsics implementation with 2-way parallel message expansion
- SHA-512 AVX2+BMI2 intrinsics implementation with 4-way parallel message expansion
- SHA-512 AVX2+BMI2 assembly implementation with 4-way parallel message expansion
- SHA-512 AVX-512 assembly implementation with 4-way parallel message expansion
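
For readers unfamiliar with "message expansion", the following is a minimal scalar Python sketch of the SHA-256 message schedule as defined in FIPS 180-4. It is not the SymCrypt implementation; the function names are illustrative assumptions. In the recurrence, the `w[t-7]`, `w[t-16]` and `small_sigma0(w[t-15])` terms for four consecutive values of `t` depend only on words that are already computed, which is what makes the expansion amenable to SIMD; the `small_sigma1(w[t-2])` term carries a dependency inside each group of four.

```python
MASK32 = 0xFFFFFFFF


def rotr(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def small_sigma0(x: int) -> int:
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)


def small_sigma1(x: int) -> int:
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)


def expand_message(block: bytes) -> list:
    """Expand one 64-byte block into the 64-word SHA-256 message schedule."""
    if len(block) != 64:
        raise ValueError("SHA-256 blocks are 64 bytes")
    # The first 16 words are the block itself, read as big-endian 32-bit words.
    w = [int.from_bytes(block[4 * i:4 * i + 4], "big") for i in range(16)]
    for t in range(16, 64):
        w.append((small_sigma1(w[t - 2]) + w[t - 7]
                  + small_sigma0(w[t - 15]) + w[t - 16]) & MASK32)
    return w


def pad_single_block(message: bytes) -> bytes:
    """Pad a short message (at most 55 bytes) into a single SHA-256 block."""
    if len(message) > 55:
        raise ValueError("message does not fit in one block")
    zeros = b"\x00" * (55 - len(message))
    return message + b"\x80" + zeros + (8 * len(message)).to_bytes(8, "big")
```

Calling `expand_message(pad_single_block(b"abc"))` returns the schedule for the standard single-block test message.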

Other changes:
 - Add INCLUDE directive to `symcryptasm_processor.py`
 - Update `symcryptasm_processor.py` to support saving non-volatile Xmm registers and allocating stack space
 - Update feed-forwarding step of block processing in C implementations
 - Use alternative expressions for LSIGMA and CSIGMA functions in SHA-512 C implementation
 - Fix updating of pcbRemaining in `SymCryptSha512AppendBlocks_ull2`

Related work items: #38759923, #38958807
2022-07-20 03:36:52 +00:00
Samuel Lee 2bc541799d Merged PR 6438924: Enable SymCryptAsm for Arm64
+ Extends SymCryptAsm format and script to work in the Arm64 context
  + Now specify architecture, assembler, and calling convention in script invocation
+ Make various changes to assembly to remove redundant instructions, and generally
 slightly improve perf for all platforms (a couple of % here and there)
+ Use assembly routines in Linux builds and remove asmstubs file
+ Do not enable Windows Arm64 build with CMake yet

Related work items: #35613721
2021-09-20 08:25:04 +00:00
Samuel Lee 0e232d4392 Merged PR 6315721: OACR fixups
+ Resolves all issues flagged by runoacr in symcrypt\lib
  + Leaves some oacr issues in test code
+ Also includes some unrelated fixes to typos etc.

Related work items: #35052770
2021-08-04 15:18:36 +00:00
Samuel Lee a667baeaa5 Merged PR 5976866: Compilation tweaks for CMake and AES-GCM
+ Add more optimization flags for MSVC in CMake to get closer to parity
  between Razzle and CMake builds
+ Make some AES-GCM tweaks for GCC/clang to avoid aggressive loop
  peeling which hurts performance by unduly increasing code size

Related work items: #32785997
2021-04-26 18:49:53 +00:00
Samuel Lee 77d1e446e4 Merged PR 5854070: Introduce symcryptasm format to enable use of asm in Windows and Linux
+ Introduce a 2 stage pre-processing setup to convert .symcryptasm to either masm (msft x64
calling convention) or gas (SystemV amd64 calling convention)
  + Step 1 converts .symcryptasm to .cppasm (using `lib\symcryptasm_processor.py`)
  + Step 2 converts .cppasm to .asm using the C preprocessor
+ Updated CMakeLists.txt to invoke this preprocessing when any relevant file is updated
+ Also introduced makefile.inc for the razzle build
+ I have translated all of the amd64 asm files we want to preserve, and the performance for big
integer reliant code is the same on Windows and Linux (and a bit better on Windows than before :))
+ In translation I did some tidying of the underlying assembly:
  + Removing needless work (some size specific functions in particular had cruft from their
adaptation from the generic sized versions)
  + Reducing code size (e.g. by using inc/dec rather than add/sub 1)
  + Some micro-optimizations to remove needless instruction dependencies

Related work items: #30621935
2021-04-23 11:33:23 +00:00