Added support for front panel port prefix regex #1

Open
wants to merge 48 commits into
base: master
from

Conversation

itamar-talmon
Owner

Description

Motivation and Context

How Has This Been Tested?

Additional Information (Optional)

vdahiya12 and others added 4 commits March 30, 2022 16:17
…nic-net#263)

This release goes in sync with the following firmware version of Broadcom Y cable, which is consistent with release 8
version : { "nic": "D103.2_D208.3", "tor_a": "D308.3", "tor_b": "D308.3" }


Description
A vendor-specific implementation of the abstract YCableBase class.
A detailed design discussion can be found at https://github.com/Azure/SONiC/pull/757/files

Motivation and Context
To support the Y-Cable API required for Broadcom's Y-Cable.

How Has This Been Tested?
Put the changes in PMON; all APIs seem to run OK.

Signed-off-by: vaibhav-dahiya <[email protected]>
sonic-net#271

Signed-off-by: Kebo Liu [email protected]

Description
Check the return value of self._parse_re; if it is N/A, skip the subsequent string slicing to avoid a crash.

Update the regular expression pattern for Innodisk SSD health to handle the different output of the health section once the SSD reaches the end of its lifetime.

Motivation and Context
The original code doesn't handle the case where self._parse_re returns N/A. It assumes self._parse_re always returns a non-N/A value and processes the result without checking it, which could result in a crash.

On Innodisk SSDs, when the remaining lifetime reaches the end, the Health section outputs a number without %, e.g. Health: 0.00 instead of Health: 95.0% in the normal case. The regular expression needs to be updated to handle this case.
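
A minimal sketch of the two fixes described above, assuming a hypothetical pattern and helper name (the actual pattern and the signature of self._parse_re in ssd_generic.py are not shown in this PR text). It makes the trailing % optional and returns early when nothing matched, so no slicing happens on a missing value.

```python
import re

# Hypothetical pattern: accepts "Health: 95.0%" and "Health: 0.00".
HEALTH_RE = re.compile(r"Health:\s*(\d+(?:\.\d+)?)%?")
NOT_AVAILABLE = "N/A"


def parse_health(tool_output):
    """Return the health value as a float, or "N/A" if it is absent."""
    match = HEALTH_RE.search(tool_output)
    if match is None:
        # Nothing was parsed: stop here instead of slicing the result.
        return NOT_AVAILABLE
    return float(match.group(1))


print(parse_health("Health: 95.0%"))
print(parse_health("Health: 0.00"))
print(parse_health("Temperature: 25C"))
```

Returning a sentinel keeps callers from having to catch an exception, at the cost of a mixed return type; the real code may make a different choice.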

How Has This Been Tested?
UT test has been added.
Tested the change on platforms with different types of SSD.
- Description
Add some new reboot causes to cover the following scenarios:
BIOS - The BIOS upgrade process ended in failure and caused the switch to reset.
CPU - Reset initiated by SW on the CPU. The SW may have hit a catastrophic situation such as a memory leak, after which the kernel reset the whole switch.
Push button - Reset by pushing the reset button.
Reset from ASIC - Reset caused by the ASIC.
Motivation and Context
Add more reboot causes to cover more scenarios.

- How Has This Been Tested?
UT is added with the code change.
Run community reboot test to see the reboot cause checker can pass.

Signed-off-by: Kebo Liu <[email protected]>
…onic-net#272)

Signed-off-by: vaibhav-dahiya <[email protected]>

Signed-off-by: vaibhav-dahiya [email protected]
The cable could be powered off, in which case the i2c to the NIC MCU cannot respond with which side is active.
For such cases the log needs to be improved.
If the cable is powered correctly but still cannot report the active side, that indicates a faulty cable.
Added/improved the appropriate logs.

Description
Motivation and Context
How Has This Been Tested?
Improved logs only
Additional Information (Optional)
@itamar-talmon itamar-talmon force-pushed the front_panel_port_name_regex branch from 7152aea to c6653fc Compare April 27, 2022 13:41
dflynn-Nokia and others added 4 commits May 2, 2022 11:22
…#270)

Description
The code that decodes the content of the ONIE syseeprom includes a flag to
enable/disable displaying the content of the vendor extension TLV. This flag is
currently set to 'disable'. Hence the 'show platform syseeprom' command shows
the presence and size of the vendor extension TLV but does not show its
content. This commit sets the flag to 'enable' so that the vendor extension TLV
content is displayed.
Motivation and Context
The 'show platform syseeprom' command shows that the Vendor Extension TLV is present but does not show its content.
Here's what that looks like on an example platform.
* [CMIS]Fix low-power to high power mode transition

* Remove python2 tests

* Improve code coverage

* Parametrize the test

* Improve code coverage
…percent (sonic-net#279)

Description
The command is not showing the correct value for ssd health.

admin@sonic:~$ show platform ssdhealth
Device Model : M.2 (S42) 3IE4
Health       : N/A
Temperature  : 25C
Motivation and Context
SSD health percentage not displayed on Nokia-7215 platform.

How Has This Been Tested?
"show platform ssdhealth" cli command
Output after fix:

admin@sonic:~$ show platform ssdhealth 
Device Model : M.2 (S42) 3IE4
Health       : 100%
Temperature  : 25C
…hen mux toggle is inprogress (sonic-net#280)

This PR adds a mux_toggle_status variable to the mux_cable base class.
Derived mux_cable classes can check this variable and return early while a mux toggle is in progress.
At the higher layer, this lets ycabled synchronize its calls and keeps a mux_cable toggle from running concurrently with some of the telemetry calls.
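
The early-return pattern described above could look roughly like the sketch below. Only the attribute name mux_toggle_status comes from the PR text; the class names, the boolean representation of the status, and the get_telemetry method are assumptions made for illustration.

```python
class MuxCableBase:
    """Stand-in for the mux_cable base class (illustrative only)."""

    def __init__(self):
        # Assumed to be a simple flag: True while a toggle is running.
        self.mux_toggle_status = False


class VendorMuxCable(MuxCableBase):
    """Hypothetical vendor implementation."""

    def get_telemetry(self):
        if self.mux_toggle_status:
            # A toggle is in progress: skip the i2c read entirely.
            return None
        # Placeholder for a real i2c read.
        return {"temperature": 25}

    def toggle(self):
        self.mux_toggle_status = True
        try:
            pass  # the i2c toggle transaction would go here
        finally:
            # Always clear the flag, even if the toggle raised.
            self.mux_toggle_status = False


cable = VendorMuxCable()
print(cable.get_telemetry())
```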
Signed-off-by: vaibhav-dahiya [email protected]

Description
Motivation and Context
To keep the toggle time to a minimum and prevent i2c transactions on the cable from colliding with each other.

How Has This Been Tested?
Ran the changes on an Arista 7050cx3 testbed.

Signed-off-by: vaibhav-dahiya <[email protected]>
@itamar-talmon itamar-talmon force-pushed the front_panel_port_name_regex branch 3 times, most recently from 92de853 to 4aee9f2 Compare June 1, 2022 07:53
alexrallen and others added 3 commits June 1, 2022 13:58
Description
The health metric matching in ssd_generic for Innodisk SSDs is too lazy. Fix it to match the entire health number rather than just the first digit.
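
To illustrate how a lazy match can stop after the first digit, here is a small sketch with made-up patterns (the actual before/after patterns in ssd_generic are not shown here). A lazy quantifier at the end of a pattern consumes as little as possible, while a greedy character class takes the whole number.

```python
import re

text = "Health: 82.34%"

# Lazy quantifier with nothing after it: satisfied by a single digit.
lazy = re.search(r"Health:\s*(\d+?)", text).group(1)

# Greedy character class: consumes every digit and the decimal point.
greedy = re.search(r"Health:\s*([\d.]+)", text).group(1)

print(lazy)
print(greedy)
```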

How Has This Been Tested?
Manual testing on Mellanox MSN2100
* Skip CDB and VDM for flat memory modules

* Improve code coverage

* Fix test failure

* Fix test failure

* Fix test failure
@itamar-talmon itamar-talmon force-pushed the front_panel_port_name_regex branch from 4aee9f2 to 8ab675b Compare June 21, 2022 08:27
microsoft-github-policy-service bot and others added 14 commits July 5, 2022 15:36
Co-authored-by: microsoft-github-policy-service[bot] <77245923+microsoft-github-policy-service[bot]@users.noreply.github.com>
* Support get_port_or_cage_type

The API returns the masks of all types of port or cage that can be supported on the port
All the types are defined in sfp_base.SfpBase

Signed-off-by: Stephen Sun <[email protected]>

* Add comments to explain the types

Signed-off-by: Stephen Sun <[email protected]>
When a transceiver that is not QSFP-DD is inserted, "sfputil show error-status -hw" fails with the error:
  File "/usr/local/lib/python3.7/dist-packages/sonic_platform_base/sonic_xcvr/sfp_optoe_base.py", line 211, in get_error_description
    return api.get_error_description() if api is not None else None
AttributeError: 'Sff8636Api' object has no attribute 'get_error_description'

Signed-off-by: chiourung_huang <[email protected]>
Description
Add unit test cases to test eeprom_tlvinfo.py, now code coverage is 80%

Motivation and Context
Previously, there was no UT for eeprom_tlvinfo.py.
The tests use a HEX file to mock the EEPROM content, take this mocked EEPROM as input, and test the functions.

Signed-off-by: Kebo Liu <[email protected]>
- Description
psud would collect input voltage and input current of the PSU.

- Motivation and Context
more information about the PSU.

- How Has This Been Tested?
Unit tests, manual tests.

Signed-off-by: orfar1994 <[email protected]>
Description
The original get_uart_stat() reports only the last UART statistics record because all records share the same object instance.
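
The shared-instance bug described above is a common Python pitfall. The sketch below reproduces it with hypothetical function and key names; it does not show the real get_uart_stat() code.

```python
def collect_buggy(lines):
    record = {}  # one dict, created once and reused
    records = []
    for line in lines:
        record["line"] = line   # overwrites the same object each time
        records.append(record)  # every entry points at that one dict
    return records


def collect_fixed(lines):
    records = []
    for line in lines:
        record = {"line": line}  # a fresh dict per record
        records.append(record)
    return records


print(collect_buggy(["tx:1", "tx:2"]))
print(collect_fixed(["tx:1", "tx:2"]))
```

In the buggy version every list entry reflects the last record, which matches the symptom in the description.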

Motivation and Context
How Has This Been Tested?

Signed-off-by: xinyu <[email protected]>
* catch Exception to avoid CMIS code crash

Signed-off-by: Kebo Liu <[email protected]>

* fix review comments, add more UT test

Signed-off-by: Kebo Liu <[email protected]>

Signed-off-by: Kebo Liu <[email protected]>
Description
Add unit testcases for pcie_common.py
Code coverage improved to 86%.

Motivation and Context
To improve code coverage for pcie_common.py
…PROM is not ready (sonic-net#305)

* get_transceiver_info should return None when cmis cable eeprom is not ready

Signed-off-by: Kebo Liu <[email protected]>

* Add more comments to describe the change

Signed-off-by: Kebo Liu <[email protected]>

Signed-off-by: Kebo Liu <[email protected]>
…nic-net#306)

* update the return for update_firmware api's failure case when the image file doesn't exist

* update comments

* update comment

* update comment
…sonic-net#301)

Signed-off-by: vaibhav-dahiya [email protected]
This PR adds the following APIs, useful for debugging the muxcable MCU. They are declared in the base class for the muxcable APIs and implemented by each vendor:

    def queue_info(self):

        This API should dump all the meaningful data from the eeprom which can
        help vendor debug the queue info currently relevant to the MCU
        using this API the vendor could check how many txns are currently in the queue etc
        for debugging purpose
    def reset_cause(self):

        This API should return the reset cause for the NIC MCU.
        This should help ascertain whether a reset was caused by soft reboot or
        cable poweroff
    def operation_time(self):
        This API should return the time since the cable is powered on from NIC MCU side
        This should be helpful in debugging purposes as to if/when the cable has been powered on
    def mem_read(self):

        This API should return the memory contents/as well as pointers/counters for DMA or hardware 
        FIFO's which could be useful for debugging the state of the MCU
xinyulin and others added 22 commits September 16, 2022 17:50
…net#303)

Description
Fix the issue where switch_count_tor_a() always clears the counter after a read.
Use YCable.EEPROM_ERROR because the definition was moved.
Check and activate firmware individually, according to the inactive firmware version.
Add more error handling to activate_firmware().
Authored by evan-lin [email protected]

Signed-off-by: vaibhav-dahiya [email protected]
* [Cloudlight] QSFP-DD FW upgrade doesn't work (sonic-net#257)

- Description
cdb1_chkstatus crashes on an i2c NACK or timeout.

- Motivation and Context
The transceiver's I2C might NACK or clock-stretch during a FW upgrade, so "None" is assumed to mean "CdbIsBusy" until the timeout expires.

* [Cloudlight] QSFP-DD FW upgrade doesn't work (sonic-net#257)

- Description
Wait for a delay in "run_fw_image" to ensure the command is really executed.
Return a special package when "get_module_fw_info" gets None.

- Motivation and Context
"run_fw_image" is executed after a delay that depends on the run command; waiting out that delay inside "run_fw_image" prevents another command from being sent before it has actually executed.
CDB commands may cause several seconds of NACK or clock stretching on the i2c bus, depending on the module vendor's implementation; handling this situation keeps the code compatible with different implementations.

* [Cloudlight] QSFP-DD FW upgrade doesn't work (sonic-net#257)

- Description
Use the real length instead of a fixed number in the "block_write_epl" function.

- Motivation and Context
To avoid using a wrong EPL length in the module.

* Update unit tests for cmis.

Test : Creating "get_module_fw_info" test.
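
The "None means busy until timeout" handling from the first commit message above can be sketched as a polling loop. The function name, the read_status callable, and the busy-bit position (bit 7) are assumptions for illustration; the real cdb1_chkstatus may be structured differently.

```python
import time

BUSY_BIT = 0x80  # assumed position of the busy flag in the status byte


def wait_cdb_idle(read_status, timeout_s=1.0, poll_s=0.01):
    """Poll until the status is readable and not busy.

    read_status is a hypothetical callable returning an int status
    byte, or None when the i2c read was NACKed or timed out.
    Returns the final status, or None if the deadline passed.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = read_status()
        # A failed read (None) is treated like "busy": keep polling.
        if status is not None and not (status & BUSY_BIT):
            return status
        time.sleep(poll_s)
    return None


responses = iter([None, 0x81, 0x01])
print(wait_cdb_idle(lambda: next(responses), poll_s=0.001))
```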
…nic-net#311)

This PR removes some of the toggle synchronization logic for SONiC telemetry. Because the SONiC telemetry table MUX_CABLE_INFO is enabled/disabled through the CLI anyway with
config muxcable telemetry enable/disable, embedding this logic in ycabled and the port_instance helper objects is redundant and possibly unnecessary.

Signed-off-by: vaibhav-dahiya [email protected]

Signed-off-by: vaibhav-dahiya <[email protected]>
Signed-off-by: maipbui <[email protected]>

Signed-off-by: maipbui <[email protected]>
* read CMIS data path state duration

* 1. Add code coverage
2. reorder entries in regGroupField
…onic-net#316)

- Description
Catch both TypeError and AttributeError in CmisApi::get_application_advertisement because an AttributeError will be thrown when updating a dict with None.

- Motivation and Context
Fix issue found during automation tests

- How Has This Been Tested?
Manually test
Added new unit test
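
As a plain-Python illustration of why both exception types are caught, the sketch below uses a hypothetical helper (not the real CmisApi code). Depending on which side of the update is None, Python raises a different exception.

```python
def merge_advertisement(base, extra):
    """Return base updated with extra, or None if either is missing."""
    try:
        base.update(extra)
    except (TypeError, AttributeError):
        # AttributeError: base is None, so it has no update() method.
        # TypeError: extra is None, which update() cannot iterate.
        return None
    return base


print(merge_advertisement({"a": 1}, {"b": 2}))
print(merge_advertisement(None, {"b": 2}))
print(merge_advertisement({"a": 1}, None))
```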

Signed-off-by: Stephen Sun <[email protected]>
…#318)

* Fix issue: copper cable should not display DOM information

* Improve unit test coverage
- Description
For CMIS cables, tx power and rx power are not rounded like other EEPROM fields.

- Motivation and Context
Fix issue: round the float values for txpower and rxpower.

- How Has This Been Tested?
Manual test
Signed-off-by: Mihir Patel <[email protected]>

Signed-off-by: Mihir Patel <[email protected]>
- Description
Deduce SSD vendor name from part number for Virtium

- Motivation and Context
Currently, ssd_generic.py deduces the vendor name from the smartctl command output. For example,

Device Model:     StorFly VSFDM8XC240G-V11-T
"StorFly" is the vendor name. However, for some SSD vendors, smartctl cannot get the vendor name.
For example:
Device Model:     VSFDM8XC240G-V11-T
In such cases, the vendor name shall be deduced from the part number.
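
One way to sketch this fallback, assuming a hypothetical prefix table and function name (the prefixes actually used by ssd_generic.py are not given in this PR text): take the first token of the Device Model when there is one, and otherwise look the part number up by prefix.

```python
# Hypothetical mapping from part-number prefix to vendor name.
PART_NUMBER_PREFIXES = {"VSF": "Virtium"}


def deduce_vendor(device_model):
    """Return the vendor name, or None when it cannot be deduced."""
    tokens = device_model.split()
    if len(tokens) > 1:
        # smartctl reported "<vendor> <part number>".
        return tokens[0]
    for prefix, vendor in PART_NUMBER_PREFIXES.items():
        if device_model.startswith(prefix):
            # Only a part number was reported: deduce from its prefix.
            return vendor
    return None


print(deduce_vendor("StorFly VSFDM8XC240G-V11-T"))
print(deduce_vendor("VSFDM8XC240G-V11-T"))
```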
…c-net#324)

* [sfp] Add media assignment options to Application Advertisement

* Increase UT coverage
Add warning/critical thresholds for PSU power

Signed-off-by: Stephen Sun <[email protected]>
…ISAR 10G LR XCVR (sonic-net#319)

* JIRA-SONIC-5341: [eBay_ec202111_214] EEPROM/DOM Info: The Compliance Code will show "unknown" by using FINISAR 10G LR XCVR

correct the code mapping

* Fix other error code mappings and add unit tests
…t#331)

Description
Upgrade the build to bullseye
Fix the branch reference issue

Motivation and Context
How Has This Been Tested?
…d in the eepromTlvInfo decode (sonic-net#333)

Description
The VENDOR_EXT field parser formats each data byte in hex with a space as separator, but the logic leaves a space at the end of the value. This PR removes the trailing space in the output of the VENDOR_EXT field in the eepromTlvInfo decode.

This change is needed by 202205 branch

Motivation and Context
The trailing space at the end of the VENDOR_EXT field causes the test of the get_system_eeprom_info() function to fail. The trailing space of the data field is an invisible character in the show platform syseeprom output, while get_system_eeprom_info() returns a dictionary with the trailing space at the end. This results in a mismatch in the test case.
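
The trailing-space problem is easy to reproduce in isolation. The sketch below uses hypothetical function names and an assumed "0x%02X" format; it contrasts appending a separator after every byte with joining the formatted bytes.

```python
def format_bytes_buggy(data):
    out = ""
    for byte in data:
        out += "0x%02X " % byte  # separator added after every byte
    return out                   # ends with an unwanted space


def format_bytes_fixed(data):
    # join() places the separator only between items.
    return " ".join("0x%02X" % byte for byte in data)


print(repr(format_bytes_buggy(b"\x01\xab")))
print(repr(format_bytes_fixed(b"\x01\xab")))
```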

Signed-off-by: mlok <[email protected]>
…onic-net#315)

* Add get_transceiver_status to API interface

* Add test_ccmis support

* Add get_transceiver_pm API interface

* Add debug log and more description for interface

* Remove unnecessary pass
Description
Add error handling to get_aux_mon_type API

Motivation and Context
get_aux_mon_type reads the field consts.AUX_MON_TYPE from the EEPROM at page 1, address 145.
If the memory model is flat, there is no page 1 in the EEPROM, only 0_lower and 0_upper.
get_aux_mon_type doesn't check whether the memory model is flat, which causes errors when running xcvrd.
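
A minimal sketch of the guard described above, using a stand-in class. Only get_aux_mon_type and the page 1, address 145 location come from the PR text; the class, its constructor, and the is_flat_memory helper are hypothetical.

```python
AUX_MON_TYPE_ADDR = 145  # page 1 address, per the description above


class FakeCmisApi:
    """Stand-in for the real API class (illustrative only)."""

    def __init__(self, flat_memory, page1=None):
        self._flat_memory = flat_memory
        self._page1 = page1 or {}

    def is_flat_memory(self):
        # Hypothetical helper reporting the module's memory model.
        return self._flat_memory

    def get_aux_mon_type(self):
        if self.is_flat_memory():
            # Flat-memory modules have no page 1: skip the read.
            return None
        return self._page1.get(AUX_MON_TYPE_ADDR)


print(FakeCmisApi(flat_memory=True).get_aux_mon_type())
print(FakeCmisApi(flat_memory=False, page1={145: 0x01}).get_aux_mon_type())
```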

How Has This Been Tested?
Tested on a testbed; confirmed that the logs of failed EEPROM reads disappeared.
@itamar-talmon itamar-talmon force-pushed the front_panel_port_name_regex branch 2 times, most recently from f317926 to b80f16a Compare January 12, 2023 08:48
@itamar-talmon itamar-talmon force-pushed the front_panel_port_name_regex branch from b80f16a to 02b013f Compare January 12, 2023 08:49