DCNM VRF Error: Creating a VRF Without vrf_id on ND 4.1
Hey guys! Today, we're diving into a specific issue encountered while using the dcnm_vrf module with Ansible and Cisco DCNM (Data Center Network Manager). Specifically, the problem arises when trying to create a VRF (Virtual Routing and Forwarding) instance without explicitly specifying a vrf_id on a Cisco Nexus Dashboard (ND) deployment running version 4.1. Let's break down the problem, explore the context, and understand how to tackle it.
The Problem
So, the main issue is that when you attempt to create a VRF using the cisco.dcnm.dcnm_vrf Ansible module without including the vrf_id parameter in your playbook, the process fails with a cryptic error message. This contrasts with the behavior observed on older releases (such as ND 3.2), where the system would automatically assign the next available vrf_id if one wasn't provided in the configuration. This can be a real head-scratcher, especially if you're used to the older behavior or if you're trying to automate VRF creation across different releases.
The error message you'll likely encounter looks something like this:
fatal: [ND]: FAILED! => {"changed": false, "module_stderr": "'NoneType' object is not subscriptable", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error"}
This error isn't super informative at first glance, but it hints at a problem within the dcnm_vrf module's code, specifically in how it handles the absence of a vrf_id during VRF creation. It suggests that the module is expecting a value that isn't there, leading to a NoneType error when it tries to access it.
Root Cause Analysis
The investigation points to a specific function within the dcnm_vrf.py module called diff_merge_create. The code responsible for querying the next available vrf_id seems to be working fine. However, the subsequent logic that handles VRF creation when a vrf_id is not explicitly provided differs from the path taken when a vrf_id is present in the playbook. This discrepancy leads to a 500 error being returned from DCNM, ultimately causing the Ansible task to fail.
In essence, the module's logic for automatically assigning a vrf_id in the absence of one being specified isn't functioning correctly against ND 4.1. This could be due to changes in the DCNM API, modifications in the expected request format, or simply a bug in the module's code that wasn't present in earlier versions.
Reproducing the Issue
To reproduce this issue, you'll need the following:
- An Ansible environment with the cisco.dcnm collection installed (version 3.9.1 in this case).
- A Cisco Nexus Dashboard (ND) environment running version 4.1.1g.
- The following Ansible playbook:

- name: NDFC Playbook Add VRF
  hosts: ND
  gather_facts: no
  tasks:
    - name: Add VRF
      cisco.dcnm.dcnm_vrf:
        fabric: pod-1
        state: merged
        config:
          - vrf_name: test
            vrf_template: Default_VRF_Universal
            vrf_extension_template: Default_VRF_Extension_Universal

Note: Make sure to replace pod-1 with the actual name of your fabric in DCNM.
When you run this playbook, you should observe the aforementioned error, indicating that the VRF creation failed because the vrf_id was not provided.
Workarounds and Solutions
So, what can you do to get around this issue? Here are a few options:
1. Explicitly Provide vrf_id
The simplest workaround is to explicitly include the vrf_id parameter in your Ansible playbook. This ensures that the module takes the code path that does work correctly on ND 4.1.
First, you'll need to determine the next available vrf_id in your DCNM environment. You can do this by either:
- Checking the DCNM GUI.
- Using the DCNM API to query existing VRFs and find the highest vrf_id in use.
Once you have the next available vrf_id, modify your playbook to include it:
- name: NDFC Playbook Add VRF
  hosts: ND
  gather_facts: no
  tasks:
    - name: Add VRF
      cisco.dcnm.dcnm_vrf:
        fabric: pod-1
        state: merged
        config:
          - vrf_name: test
            vrf_id: <your_next_available_vrf_id>
            vrf_template: Default_VRF_Universal
            vrf_extension_template: Default_VRF_Extension_Universal
Replace <your_next_available_vrf_id> with the actual value you obtained.
While this workaround solves the immediate problem, it's not ideal for fully automated deployments, as it requires you to manually determine the next available vrf_id. However, you can automate that step too by using Ansible's uri module to query the DCNM API, find the highest vrf_id in use, and increment it in your playbook, as sketched below.
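Here's a minimal sketch of that approach. The REST path (an NDFC-style top-down VRF endpoint), the nd_api_token variable, and the vrfId field name are assumptions for illustration, so adjust them to match your controller release, its API documentation, and however you handle authentication in your inventory:

- name: Determine the next available vrf_id and create the VRF
  hosts: ND
  gather_facts: no
  vars:
    fabric: pod-1
  tasks:
    - name: Query existing VRFs (endpoint path is an assumption; adjust for your release)
      ansible.builtin.uri:
        url: "https://{{ ansible_host }}/appcenter/cisco/ndfc/api/v1/lan-fabric/rest/top-down/fabrics/{{ fabric }}/vrfs"
        method: GET
        headers:
          Authorization: "Bearer {{ nd_api_token }}"  # hypothetical variable holding your API token
        validate_certs: no
        return_content: yes
      register: vrf_response

    - name: Compute the next vrf_id (49999 is a floor so the first VRF gets 50000)
      ansible.builtin.set_fact:
        next_vrf_id: "{{ (((vrf_response.json | map(attribute='vrfId') | map('int') | list) + [49999]) | max) + 1 }}"

    - name: Add VRF with the computed vrf_id
      cisco.dcnm.dcnm_vrf:
        fabric: "{{ fabric }}"
        state: merged
        config:
          - vrf_name: test
            vrf_id: "{{ next_vrf_id | int }}"
            vrf_template: Default_VRF_Universal
            vrf_extension_template: Default_VRF_Extension_Universal

The idea is simply to read back whatever VRFs already exist, take the highest vrfId, and feed highest + 1 into dcnm_vrf so the module stays on the code path that works.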
2. Modify the dcnm_vrf Module (Advanced)
Disclaimer: This approach involves modifying the Ansible collection code, which is generally not recommended unless you're comfortable with Python and have a good understanding of the module's inner workings. Always back up your original files before making any changes.
If you're feeling adventurous, you can attempt to modify the dcnm_vrf.py module to correctly handle the case where vrf_id is not provided. This would involve:
- Identifying the exact location in the diff_merge_create function where the error occurs.
- Examining the code path taken when vrf_id is present and understanding how it differs from the path taken when it's absent.
- Modifying the code to ensure that the correct API calls are made to DCNM to automatically assign a vrf_id when one is not provided.
This approach requires a solid understanding of the DCNM API and the dcnm_vrf module's code. It's also important to test your changes thoroughly to ensure that they don't introduce any new issues.
3. Contribute to the cisco.dcnm Collection
The best long-term solution is to contribute a fix to the cisco.dcnm collection on Ansible Galaxy. This ensures that the issue is resolved for everyone and that your changes are properly tested and maintained.
To do this, you can:
- Fork the cisco.dcnm collection repository on GitHub.
- Implement the fix in your forked repository.
- Submit a pull request to the main cisco.dcnm repository.
This allows the maintainers of the collection to review your changes and incorporate them into a future release.
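While your pull request is under review, you can also consume the fix directly from your fork by installing the collection from Git with a requirements.yml. The URL and branch below are placeholders for your own fork (assuming it keeps the upstream ansible-dcnm repository name):

# requirements.yml
collections:
  - name: https://github.com/<your-github-user>/ansible-dcnm.git
    type: git
    version: <your-fix-branch>

Then run ansible-galaxy collection install -r requirements.yml --force so your patched copy replaces the version installed from Galaxy.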
Conclusion
So there you have it! The issue of VRF creation failing without a vrf_id on ND 4.1 when using the dcnm_vrf module can be a frustrating one. However, by understanding the root cause and implementing one of the workarounds or solutions described above, you can overcome this hurdle and continue automating your network deployments. Remember to always test your changes thoroughly and consider contributing back to the community to help others who may encounter the same problem. Happy automating!