This adds support for the Trace Hub in Rocket Lake CPUs.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable <stable@vger.kernel.org> # v4.14+
Link: https://lore.kernel.org/r/20210414171251.14672-7-alexander.shishkin@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The only usage of them is to pass their address to sysfs_create_group()
and sysfs_remove_group(), both which have pointers to const
attribute_group structs as input. Make them const to allow the compiler
to put them in read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20210414171251.14672-5-alexander.shishkin@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Anything that deals with drvdata structures should leave them intact.
Reflect this in function signatures.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20210414171251.14672-4-alexander.shishkin@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the case where size is zero the while loop never assigns rc and the
return value is uninitialized. Fix this by initializing rc to zero.
Fixes: 639781dcab ("habanalabs/gaudi: add debugfs to DMA from the device")
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Addresses-Coverity: ("Uninitialized scalar variable")
Link: https://lore.kernel.org/r/20210412161012.1628202-1-colin.king@canonical.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix these kernel-doc complaints:
../drivers/greybus/es2.c:79: warning: bad line:
../drivers/greybus/es2.c💯 warning: cannot understand function prototype: 'struct es2_ap_dev '
es2.c:126: warning: Function parameter or member 'cdsi1_in_use' not described in 'es2_ap_dev'
Cc: Johan Hovold <johan@kernel.org>
Cc: Alex Elder <elder@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: greybus-dev@lists.linaro.org (moderated for non-subscribers)
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20210415054338.2223-1-rdunlap@infradead.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
These are the interconnect changes for the 5.13-rc1 merge window
with the highlights being drivers for two new platforms.
Driver changes:
- New driver for SM8350 platforms.
- New driver for SDM660 platforms.
Signed-off-by: Georgi Djakov <djakov@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJgd/GZAAoJEIDQzArG2BZjJ3kP/jgsX2Z5J4/PqKgWaK8iFYs7
5DlDkX4Ogb61LXoAZrrKKAW8HLTKGrGB+AdfBMPT8nU865Js+EdWJXRrOmmPlaUn
CFYct8ABRfip38VLkfyv6LDbfd43lIk8c809JFQiLQ9uFZSy5ZW/Vvh9ZCsMCSF6
Id7hKNzevTM3i0U8mLYJyhvrQ2G5qQUiiuQNeUJxt6Cl2hdzsk13MhdL5pFYwdJA
WAEu3+Wcj3PeMo7+bWJO/VV5MswhHtYhS8hF/HsPYmtDTgNUkkMAXiFArBO1wB8H
RgRLPTOUcJq8YgqWYFoaAVn1Gp2SDWyCK/c0CcLuh/rILrpICCJrHPmoS3V/beJk
B+GPLilSN51KSvvkO/naiS/30lgF2a2355IXOfT4EKyEgVlmMmM8FmdfLkEfqQbI
nTT/mTDghrkCkx+n1mLu1T5o7kty+9PVAz6vz7N2evFSJrUC7VFhsY1ZKDPOBrgY
sr34SYS2SnkCyu32hga/nMKEqYmKLyY2WsQuDM0ExJP+MnvgzNj31fdXh6iRb9YB
34/smWyfBnzyIxIhL8bJiRvnHwIerLKdyCMzO2WMaBY436WeNK7xI38zMLqfpqs0
3ntatJeE1E9Y20FUWwdHuHIvqgL0cbw/0JqqdAqLoU8SmPi+nRX4XOFdVu0ckrTy
SXCb6lWCHVWDI0/kVYEp
=SguC
-----END PGP SIGNATURE-----
Merge tag 'icc-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-next
Georgi writes:
interconnect changes for 5.13
These are the interconnect changes for the 5.13-rc1 merge window
with the highlights being drivers for two new platforms.
Driver changes:
- New driver for SM8350 platforms.
- New driver for SDM660 platforms.
Signed-off-by: Georgi Djakov <djakov@kernel.org>
* tag 'icc-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc:
interconnect: qcom: sm8350: Add missing link between nodes
interconnect: qcom: sm8350: Use the correct ids
interconnect: qcom: sdm660: Fix kerneldoc warning
MAINTAINERS: icc: add interconnect tree
interconnect: qcom: Add SM8350 interconnect provider driver
dt-bindings: interconnect: Add Qualcomm SM8350 DT bindings
interconnect: qcom: icc-rpm: record slave RPM id in error log
interconnect: qcom: Add SDM660 interconnect provider driver
dt-bindings: interconnect: Add bindings for Qualcomm SDM660 NoC
Removes old information related to the stop file interface in sysfs left
by mistake during patch revision.
Improves the document text format to be more user-friendly and adds
basic driver related information, such as support, datasheet, and author.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Link: https://lore.kernel.org/r/4e72f931474a784d478e5a67961ecf116911997a.1618066164.git.gustavo.pimentel@synopsys.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
core:
- Added support for Flash Programmer execution environment which allows the
host machine (like x86) to flash the modem firmware to NAND or eMMC in the
modem. The MHI bus will expose EDL channels (34, 35) and then the opensource
QDL tool [1] can be used to flash the firmware from the host.
- Added an internal helper for polling the MHI registers with a retry interval.
This helper is used now to poll for the MHI ready state in MHI STATUS
register.
- Various fixes for issues found during the bringup of SDX24/SDX55 based Quectel
and Telit modems.
- Updates to the Execution environment handling for proper downloading of the
AMSS image from SBL (Secondary Bootloader) mode.
- Added support for sending STOP channel command to the MHI device and also made
changes to the MHI core for proper handling of stop and restart.
- Fixed the runtime_pm handling in the core by forcing the device to be in wake
mode until TX completion and allowing it to suspend for RX.
- Added sanity checks for values read from the device to avoid crash if those
are corrupted somehow.
- Fixed warnings generated by sparse (W=2)
- Couple of kernel doc cleanups in mhi.h
pci_generic:
- Added support for runtime PM and generic PM
- Added Firehose channels for flashing the firmware
- Added support for modems such as Quectel EM1XXGR-L, SDX24, SDX65, Foxconn
T99W175 exposing relevant channels.
[1] https://git.linaro.org/landing-teams/working/qualcomm/qdl.git
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEZ6VDKoFIy9ikWCeXVZ8R5v6RzvUFAmByjAUACgkQVZ8R5v6R
zvXcgQf/cmZ+E7DUXfIutbtxu0WQsprLoi12Z+tNy+0di/6HbstodNYsDJGEMCeg
f5mXClHMTj6uO5aRu+5tgWxA6pNNBeHSpJmztbbxjrtdiAC1tZHXMFCMQ/Mj4Sv5
IFmfHVF/wsMdFJUkfaOWC45mVhPG/TK5Wng86CUSZXUdhgC0AxY0mQqmivTjS5UE
TA1zxCTS7ni97fceGM+V2JlebFYJJ+gwkVVgHhZMF0x+1xNldoNCxjSfMso6EeS1
ThK8bjxYYi/eRcM1jltdv/zWlJbePOTSos5Pkm+NQsauPWtETELKq58MDhLTzD28
aiQ8mx10gsYDGXvXpfh3nsMN1pwOfg==
=lVaD
-----END PGP SIGNATURE-----
Merge tag 'mhi-for-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mani/mhi into char-misc-next
Manivannan writes:
MHI changes for v5.13
core:
- Added support for Flash Programmer execution environment which allows the
host machine (like x86) to flash the modem firmware to NAND or eMMC in the
modem. The MHI bus will expose EDL channels (34, 35) and then the opensource
QDL tool [1] can be used to flash the firmware from the host.
- Added an internal helper for polling the MHI registers with a retry interval.
This helper is used now to poll for the MHI ready state in MHI STATUS
register.
- Various fixes for issues found during the bringup of SDX24/SDX55 based Quectel
and Telit modems.
- Updates to the Execution environment handling for proper downloading of the
AMSS image from SBL (Secondary Bootloader) mode.
- Added support for sending STOP channel command to the MHI device and also made
changes to the MHI core for proper handling of stop and restart.
- Fixed the runtime_pm handling in the core by forcing the device to be in wake
mode until TX completion and allowing it to suspend for RX.
- Added sanity checks for values read from the device to avoid crash if those
are corrupted somehow.
- Fixed warnings generated by sparse (W=2)
- Couple of kernel doc cleanups in mhi.h
pci_generic:
- Added support for runtime PM and generic PM
- Added Firehose channels for flashing the firmware
- Added support for modems such as Quectel EM1XXGR-L, SDX24, SDX65, Foxconn
T99W175 exposing relevant channels.
[1] https://git.linaro.org/landing-teams/working/qualcomm/qdl.git
* tag 'mhi-for-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mani/mhi: (49 commits)
bus: mhi: fix typo in comments for struct mhi_channel_config
bus: mhi: core: Fix shadow declarations
bus: mhi: pci_generic: Constify mhi_controller_config struct definitions
bus: mhi: pci_generic: Introduce Foxconn T99W175 support
bus: mhi: core: Sanity check values from remote device before use
bus: mhi: pci_generic: Add FIREHOSE channels
bus: mhi: pci_generic: Implement PCI shutdown callback
bus: mhi: Improve documentation on channel transfer setup APIs
bus: mhi: core: Remove __ prefix for MHI channel unprepare function
bus: mhi: core: Check channel execution environment before issuing reset
bus: mhi: core: Clear configuration from channel context during reset
bus: mhi: core: Hold device wake for channel update commands
bus: mhi: core: Update debug messages to use client device
bus: mhi: core: Improvements to the channel handling state machine
bus: mhi: core: Clear context for stopped channels from remove()
bus: mhi: core: Allow sending the STOP channel command
bus: mhi: pci_generic: Add SDX65 based modem support
bus: mhi: core: Remove pre_init flag used for power purposes
bus: mhi: pm: reduce PM state change verbosity
bus: mhi: core: Fix MHI runtime_pm behavior
...
- Add support to reset device after the user closes the file descriptor.
Because we support a single user, we can reset the device (if needs to)
after a user closes its file descriptor to make sure the device is in
idle and clean state for the next user.
- Add a new feature to allow the user to wait on interrupt. This is needed
for future ASICs
- Replace GFP_ATOMIC with GFP_KERNEL wherever possible and add code to
support failure of allocating with GFP_ATOMIC.
- Update code to support the latest firmware image:
- More security features are done in the firmware
- Remove hard-coded assumptions and replace them with values that are
sent to the firmware on loading.
- Print device unusable error
- Reset device in case the communication between driver and firmware
gets out of sync.
- Support new PCI device ids for secured GAUDI.
- Expose current power draw through the INFO IOCTL.
- Support resetting the device upon a request from the BMC (through F/W).
- Always use only a single MSI in GAUDI, due to H/W limitation.
- Improve data-path code by taking out code from spinlock protection.
- Allow user to specify custom timeout per Command Submission.
- Some enhancements to debugfs.
- Various minor changes and improvements.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEE7TEboABC71LctBLFZR1NuKta54AFAmByAxcTHG9nYWJiYXlA
a2VybmVsLm9yZwAKCRBlHU24q1rngLzgB/4gkltBLgkp+VaIK+fb7uB34CY096M1
e8iO7eqxjrjW3YymBaHVYVuaogv9/XD1pUap/rsvw4Ytvb3g390wLjHhsHcSW0AM
8gIswu0VqWWrxphe0ns+ArV4j6JWVBkUQ1QDxp9Ut0qMaUZha/EkfAengMseQbjR
3oaPwrUCpPpl4XfZaBTxTg3RyHtXnzi3cFw2b207D9iX8DS69TtLgMPAj5xN4vO2
lei/4ZRJw/MbJSwvmNJt2d7E7CniLQh9sy7JnMeinpG+WD4GMdx1m0bI8fIuKQ11
GkvVRREGHuQ0YtvTIWi9K+GAwJNqIIw/cW8M3+P1+7WjLWAKkyOTtYk5
=ZHBU
-----END PGP SIGNATURE-----
Merge tag 'misc-habanalabs-next-2021-04-10' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into char-misc-next
Oded writes:
This tag contains habanalabs driver changes for v5.13:
- Add support to reset device after the user closes the file descriptor.
Because we support a single user, we can reset the device (if needs to)
after a user closes its file descriptor to make sure the device is in
idle and clean state for the next user.
- Add a new feature to allow the user to wait on interrupt. This is needed
for future ASICs
- Replace GFP_ATOMIC with GFP_KERNEL wherever possible and add code to
support failure of allocating with GFP_ATOMIC.
- Update code to support the latest firmware image:
- More security features are done in the firmware
- Remove hard-coded assumptions and replace them with values that are
sent to the firmware on loading.
- Print device unusable error
- Reset device in case the communication between driver and firmware
gets out of sync.
- Support new PCI device ids for secured GAUDI.
- Expose current power draw through the INFO IOCTL.
- Support resetting the device upon a request from the BMC (through F/W).
- Always use only a single MSI in GAUDI, due to H/W limitation.
- Improve data-path code by taking out code from spinlock protection.
- Allow user to specify custom timeout per Command Submission.
- Some enhancements to debugfs.
- Various minor changes and improvements.
* tag 'misc-habanalabs-next-2021-04-10' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux: (41 commits)
habanalabs: print f/w boot unknown error
habanalabs: update to latest F/W communication header
habanalabs/gaudi: skip iATU if F/W security is enabled
habanalabs/gaudi: derive security status from pci id
habanalabs: move dram scrub to free sequence
habanalabs: send dynamic msi-x indexes to f/w
habanalabs/gaudi: clear QM errors only if not in stop_on_err mode
habanalabs: support DEVICE_UNUSABLE error indication from FW
habanalabs: use strscpy instead of sprintf and strlcpy
habanalabs: remove the store jobs array from CS IOCTL
habanalabs/gaudi: add debugfs to DMA from the device
habanalabs/gaudi: sync stream add protection to SOB reset flow
habanalabs: add custom timeout flag per cs
habanalabs: improve utilization calculation
habanalabs: support legacy and new pll indexes
habanalabs: move relevant datapath work outside cs lock
habanalabs: avoid soft lockup bug upon mapping error
habanalabs/gaudi: Update async events header
habanalabs/gaudi: unsecure TPC cfg status registers
habanalabs/gaudi: always use single-msi mode
...
When CONFIG_QCOM_SCM is y and CONFIG_HAVE_ARM_SMCCC
is not set, compiling errors are encountered as follows:
drivers/firmware/qcom_scm-smc.o: In function `__scm_smc_do_quirk':
qcom_scm-smc.c:(.text+0x36): undefined reference to `__arm_smccc_smc'
drivers/firmware/qcom_scm-legacy.o: In function `scm_legacy_call':
qcom_scm-legacy.c:(.text+0xe2): undefined reference to `__arm_smccc_smc'
drivers/firmware/qcom_scm-legacy.o: In function `scm_legacy_call_atomic':
qcom_scm-legacy.c:(.text+0x1f0): undefined reference to `__arm_smccc_smc'
Note that __arm_smccc_smc is defined when HAVE_ARM_SMCCC is y.
So add dependency on HAVE_ARM_SMCCC in QCOM_SCM configuration.
Fixes: 916f743da3 ("firmware: qcom: scm: Move the scm driver to drivers/firmware")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: He Ying <heying24@huawei.com>
Link: https://lore.kernel.org/r/20210406094200.60952-1-heying24@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use kmemdup_nul() helper instead of open-coding to
simplify the code in spk_msg_set().
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20210406034434.442251-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When async binder buffer got exhausted, some normal oneway transactions
will also be discarded and may cause system or application failures. By
that time, the binder debug information we dump may not be relevant to
the root cause. And this issue is difficult to debug if without the
backtrace of the thread sending spam.
This change will send BR_ONEWAY_SPAM_SUSPECT to userspace when oneway
spamming is detected, request to dump current backtrace. Oneway spamming
will be reported only once when exceeding the threshold (target process
dips below 80% of its oneway space, and current process is responsible for
either more than 50 transactions, or more than 50% of the oneway space).
And the detection will restart when the async buffer has returned to a
healthy state.
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Hang Lu <hangl@codeaurora.org>
Link: https://lore.kernel.org/r/1617961246-4502-3-git-send-email-hangl@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add BR_FROZEN_REPLY in binder_return_strings to support stat function.
Fixes: ae28c1be1e ("binder: BINDER_GET_FROZEN_INFO ioctl")
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Hang Lu <hangl@codeaurora.org>
Link: https://lore.kernel.org/r/1617961246-4502-2-git-send-email-hangl@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The word 'rung' is a typo in below comment, fix it.
* @event_ring: The event rung index that services this channel
Signed-off-by: Jarvis Jiang <jarvis.w.jiang@gmail.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20210408100220.3853-1-jarvis.w.jiang@gmail.com
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
We need to print a message to the kernel log in case we encounter
an unknown error in the f/w boot to help the user understand what
happened.
In addition, we shouldn't print unknown error in case of known errors.
Moreover, in case of warnings/info, we shouldn't return -EIO that will
fail the initialization and mark the device as disabled
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
update files to latest version from F/W team.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As part of the securing GAUDI, the F/W will configure the PCI iATU
regions. If the driver identifies a secured PCI ID, it will know to
skip iATU configuration in a very early stage.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As F/ security indication must be available before driver approaches
PCI bus, F/W security should be derived from PCI id rather than be
fetched during boot handshake with F/W.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
DRAM scrubbing can take time hence it adds to latency during allocation.
To minimize latency during initialization, scrubbing is moved to release
call.
In case scrubbing fails it means the device is in a bad state,
hence HARD reset is initiated.
Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to minimize hard coded values between F/W and the driver, we
send msi-x indexes dynamically to the F/W.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Clearing QM errors by the driver will prevent these H/W blocks from
stopping in case they are configured to stop on errors, so perform this
clearing only if this mode is not in use.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In case of multiple ECC errors, FW will set the DEVICE_UNUSABLE bit.
On boot-up, the driver will therefore fail inserting the device.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Prefer the use of strscpy when copying the ASIC name into a char array,
to prevent accidentally exceeding the array's length.
In addition, strlcpy is frowned upon so replace it.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The store part was never implemented in the code and never been used
by the userspace applications.
We currently use the related parameters to a different purpose with
a defined union. However, there is no point in that and it is better
to just remove the union and the store parameters.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
When trying to debug program, the user often needs to
dump large parts of the device's DRAM, which can reach to tens of GBs.
Because reading from the device's internal memory through the PCI BAR
is extremely slow, the debug can take hours.
Instead, we can provide the user to copy data through one of the DMA
engines. This will make the operation much faster.
Currently, only GAUDI is supported.
In GAUDI, we need to find a PCI DMA engine that is IDLE and set the
DMA as secured to be able to bypass our MMU as we currently don't
map the temporary buffer to the MMU.
Example bash one-line to dump entire HBM to file (~2 minutes):
for (( i=0x0; i < 0x800000000; i+=0x8000000 )); do \
printf '0x%x\n' $i | sudo tee /sys/kernel/debug/habanalabs/hl0/addr ; \
echo 0x8000000 | sudo tee /sys/kernel/debug/habanalabs/hl0/dma_size ; \
sudo cat /sys/kernel/debug/habanalabs/hl0/data_dma >> hbm.txt ; done
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Since we moved the SOB reset flow to workqueue and
not part of the fence release flow, we might reach a
scenario where new context is created while we in the middle
of resetting the SOB.
in such cases the reset may fail due to idle check.
This will mess up the streams sync since the SOB value is invalid.
so we protect this area with a mutex, to delay context creation.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
There is a need to allow to user to send command submissions with
custom timeout as some CS take longer than the max timeout that is
used by default.
Signed-off-by: Alon Mizrahi <amizrahi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The new approach is based on the notion that the relative
current power consumption is in relation of proportionality
to device's true utilization.
Utilization info ranges between [0,100]%
Currently, dc_power values are hard-coded.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to use minimum of hard coded values common to LKD and F/W
a dynamic method to work with PLLs is introduced in this patch.
Formerly asic specific PLL numbering is now common for all asics.
To be backward compatible a bit in dev status is defined, if the bit is
not set LKD will keep working with old PLL numbering.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to shorten the time cs lock is being held, we move any
possible work outside of the cs lock.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add a little sleep between page unmappings in case mapping of
large number of host pages failed, in order to
avoid soft lockup bug during the rollback.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Update with latest version from the Firmware team.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The device can get into deadlock in case it use indirect mode for MSI
interrupts (multi-msi) and have hard-reset during interrupt storm.
To prevent that, always use direct mode which means single-msi mode.
The F/W will prevent the host from writing to the indirect MSI
registers to prevent any malicious user from causing this scenario.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In case the BMC of the devices' box wants to initiate a reset of
a specific device, it must go through driver.
Once driver will receive the request it will initiate a hard reset
flow.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to have a better debuggability we allow debugfs access
to user mmu mapped host memory. Non-user host memory access will be
rejected.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
fixed the following coccicheck:
./drivers/misc/habanalabs/common/sysfs.c:347:60-61: WARNING opportunity
for kobj_to_dev()
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Update to the latest version of the file as supplied by the F/W.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
if reset is due to heartbeat, device CPU is no responsive in which
case no point sending PCI disable message to it.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As there are incorrect assumptions in which some of the
initialization and data path flows cannot sleep, most allocations
are being done using GFP_ATOMIC.
We modify the code to use GFP_ATOMIC only when realy needed, as
sleepable flow should use GFP_KERNEL.
In addition add a fallback to allocate memory using GFP_KERNEL,
once ATOMIC allocation fails.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Update to the latest definition of the firmware
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add driver implementation for reading the current power from the device
CPU F/W.
Signed-off-by: Sagiv Ozeri <sozeri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>