Fixing A Chardata Size Bug In SONiC's SAI Implementation
Hey everyone! Let's dive into a bug I found while working with the SONiC (Software for Open Networking in the Cloud) and SAI (Switch Abstraction Interface) stack. Specifically, there's an issue related to how the code handles the size limitations for chardata
attributes when creating new SAI objects. This bug causes unnecessary errors, and I'm here to break down the problem, explain the root cause, and suggest a fix. This is important for those of us working with SONiC, especially when configuring attributes of char
type.
The Problem: Incorrect Size Limitation for Chardata
So, here's the deal: I was trying to create a new SAI object with a MANDATORY_ON_CREATE char
type attribute in SWSS (SONiC's Switch State Service). I set the attribute value to 'OCM0-0|0|191262500'. However, when the code called the SAI create function, it failed with the error "SAI_ATTR_VALUE_TYPE_CHARDATA host interface name is too long." This was a head-scratcher because the value I provided wasn't particularly long, and I knew it should fit. The issue is the size check. The code incorrectly assumes that all chardata
attributes must adhere to SAI_HOSTIF_NAME_SIZE
(16), which is designed specifically for things like host interface names.
This causes issues because other chardata
types, such as names for OTN OCM channels, can be longer than 16 characters. This is a problem because the code is designed to work with a wider range of data.
Let's make it simple: imagine you're trying to enter a longer name than it allows. This is what happened to me and the system does not work as expected. Specifically, the error occurs in meta_generic_validation_create
. This function, as the name suggests, performs a generic validation for object creation. The root of the issue lies in meta.cpp
, where the size check is implemented:
sai_status_t Meta::meta_generic_validation_create(...)
{
...
switch (md.attrvaluetype)
{
case SAI_ATTR_VALUE_TYPE_CHARDATA:
{
const char* chardata = value.chardata;
size_t len = strnlen(chardata, SAI_HOSTIF_NAME_SIZE);
if (len == SAI_HOSTIF_NAME_SIZE)
{
META_LOG_ERROR(md, "host interface name is too long");
return SAI_STATUS_INVALID_PARAMETER;
}
...
}
}
As you can see, the code uses SAI_HOSTIF_NAME_SIZE
(16) for all string attributes. This should only apply to the host interface and multicast group names, which is not what is desired.
SAI Definition and the Incorrect Implementation
According to the SAI specification, the maximum size for a host interface name is indeed 16 characters. However, other string attributes, such as the name
attribute for an OTN OCM channel, can potentially be longer. The SAI defines the maximum length in saihostif.h
:
/**
* @brief Defines maximum host interface name
*/
#define SAI_HOSTIF_NAME_SIZE 16
/**
* @brief Defines maximum length of generic netlink multicast group name
*/
#define SAI_HOSTIF_GENETLINK_MCGRP_NAME_SIZE 16
But, in the sai_attribute_value_t
union, chardata
has a size of 32, which is more than the check, indicating the intent to support longer strings:
typedef union _sai_attribute_value_t
{
/** @validonly meta->attrvaluetype == SAI_ATTR_VALUE_TYPE_BOOL */
bool booldata;
/** @validonly meta->attrvaluetype == SAI_ATTR_VALUE_TYPE_CHARDATA */
char chardata[32];
/** @validonly meta->attrvaluetype == SAI_ATTR_VALUE_TYPE_UINT8 */
sai_uint8_t u8;
My error log clearly shows the problem. The error messages "host interface name is too long" appear when creating an OTN OCM channel, where the attribute's intended length exceeds 16 characters. This indicates that the implemented check does not meet the requirements.
2025 Oct 9 20:57:30.033605 sonic NOTICE swss#orchagent: :- translateObjectAttr: translateObjectAttr, field = name, value = OCM0-0|0|191262500
2025 Oct 9 20:57:30.033605 sonic NOTICE swss#orchagent: :- translateObjectAttr: translateObjectAttr, field = upper_frequency, value = 191337500
2025 Oct 9 20:57:30.033605 sonic ERR swss#orchagent: :- meta_generic_validation_create: SAI_OTN_OCM_CHANNEL_ATTR_NAME:SAI_ATTR_VALUE_TYPE_CHARDATA host interface name is too long
2025 Oct 9 20:57:30.033605 sonic ERR swss#orchagent: :- createObject: Failed to create SAI_OBJECT_TYPE_OTN_OCM_CHANNEL|OCM0-0|0|191262500, rv=-5
2025 Oct 9 20:57:30.033605 sonic ERR swss#orchagent: :- doTask: Failed to create object
2025 Oct 9 20:57:30.035943 sonic INFO swss#supervisord: message repeated 12 times: [ orchagent ]
2025 Oct 9 20:57:30.035943 sonic INFO swss#supervisord: orchagent terminate called after throwing an instance of 'std::runtime_error'
2025 Oct 9 20:57:30.035943 sonic INFO swss#supervisord: orchagent what(): :- doTask: Failed to create object
2025 Oct 9 20:57:30.363750 sonic INFO swss#supervisord 2025-10-09 20:57:30,363 WARN exited: orchagent (terminated by SIGABRT (core dumped); not expected)
The Solution: Tailoring the Size Check
The fix involves modifying the size check within meta_generic_validation_create
to only apply SAI_HOSTIF_NAME_SIZE
to attributes that explicitly require this size limitation. For other string attributes, the check should use the defined maximum size for chardata
, which is 32 characters.
Here’s how we can approach the fix:
- Identify Specific Attributes: The code needs to determine which string attributes are host interface names or multicast group names and apply the 16-character limit. For all other string attributes, we should use the defined
chardata
size (32). - Modify
meta_generic_validation_create
: Adjust the size check within theSAI_ATTR_VALUE_TYPE_CHARDATA
case. The code should check the attribute type (md.attr_type
) to determine the appropriate size limitation. It's necessary to have a mechanism to identify when to use theSAI_HOSTIF_NAME_SIZE
versus the fullchardata
size.
Here is a suggestion:
sai_status_t Meta::meta_generic_validation_create(...)
{
...
switch (md.attrvaluetype)
{
case SAI_ATTR_VALUE_TYPE_CHARDATA:
{
const char* chardata = value.chardata;
size_t max_len = 32; // Default to chardata size
// Check if it's a host interface name or multicast group name, apply the limit
if (md.attr_id == SAI_HOSTIF_ATTR_NAME || md.attr_id == SAI_HOSTIF_ATTR_GENETLINK_MCGRP_NAME)
{
max_len = SAI_HOSTIF_NAME_SIZE;
}
size_t len = strnlen(chardata, max_len);
if (len == max_len)
{
META_LOG_ERROR(md, "String is too long");
return SAI_STATUS_INVALID_PARAMETER;
}
break;
}
}
...
}
This revised code ensures that the size check is correctly applied only to attributes that need it, allowing for other strings of up to 32 characters.
Conclusion: Improving SONiC's SAI Implementation
So, there you have it! We've uncovered and addressed an important bug in SONiC's SAI implementation, specifically the incorrect size limitation for chardata
attributes. By pinpointing the root cause in meta.cpp
and proposing a targeted fix, we can prevent errors, enhance the stability and flexibility of the system, and ensure that the SAI create functions operate correctly for various string attributes. This fix is crucial for network engineers and developers using SONiC, and implementing it will improve the overall reliability of the SONiC platform.
This is a classic example of how even seemingly small bugs can have a significant impact on a system's functionality. By understanding the details of the SAI specification and the implementation, we can work to resolve these issues, which is essential for the robustness and reliability of the network.
Thanks for reading, and I hope this helps you in your SONiC and SAI endeavors! If you find other interesting bugs or have any questions, feel free to share them!