yahoo-eng-team team mailing list archive

[Bug 2080447] [NEW] Update firewall policy interface timeout when it's associated with a large number of rules

 

Public bug reported:

* High level description: When a firewall policy is associated with
thousands of rules, the update firewall policy interface
(v2.0/fwaas/firewall_policies/<policy_id>) becomes very slow.

* Pre-conditions: The firewall policy is associated with a large number
of rules.

* Step-by-step reproduction steps:
Update a firewall policy that is associated with thousands of rules
through the update firewall policy interface
(v2.0/fwaas/firewall_policies/<policy_id>). The call becomes very slow,
and a client whose timeout is set to 60 or 120 seconds sees the request
time out.

I added logs to the update firewall policy interface and eventually
found that it took 429 seconds to call the _delete_all_rules_from_policy
function.
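
For illustration, timing of this kind can be added with a small context manager that logs a start line and an end line carrying the elapsed seconds. This is only a sketch: log_duration and its parameters are hypothetical names, and the logging code actually added for this report is not shown here.

```python
import logging
import time
from contextlib import contextmanager

LOG = logging.getLogger(__name__)


@contextmanager
def log_duration(label, clock=time.monotonic, log=LOG.info):
    """Log '<label> start', run the wrapped block, then log the elapsed time.

    `label`, `clock` and `log` are illustrative parameters, not part of
    neutron-fwaas. `clock` and `log` are injectable so the helper can be
    exercised without waiting or configuring logging.
    """
    start = clock()
    log("%s start", label)
    try:
        yield
    finally:
        # Emit the end line even if the wrapped block raises.
        log("%s end, time=%s", label, clock() - start)
```

Wrapping the body of the slow call in "with log_duration(...)" yields a start/end pair of log lines of the same shape as the ones below.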

log detail:
2024-09-11 16:27:14.749 132 INFO neutron_fwaas.db.firewall.v2.firewall_db_v2 [None req-6af44478-e8e4-411a-939d-dc58daeb5503 c7040248a93044c38791ac91cb4590ab e8b6083c43514540b6686925c5581915 - - default default] _delete_all_rules_from_policy policy_id=3e83c14f-f013-4c6a-a0c7-96271f1fb814 start
2024-09-11 16:34:23.942 132 INFO neutron_fwaas.db.firewall.v2.firewall_db_v2 [None req-6af44478-e8e4-411a-939d-dc58daeb5503 c7040248a93044c38791ac91cb4590ab e8b6083c43514540b6686925c5581915 - - default default] _delete_all_rules_from_policy policy_id=3e83c14f-f013-4c6a-a0c7-96271f1fb814 end, time=429.19302344322205

Further logging revealed that _delete_all_rules_from_policy queries the
rule information from the database one rule at a time, based on the
associated rule IDs, and then deletes them. Each query takes between 0.3
and 0.5 seconds, so for a firewall policy associated with 1000 rules the
total time would be around 300 to 500 seconds.
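
The estimate above is a simple product of rule count and per-query latency. A minimal sketch of that arithmetic, where the function name is hypothetical and the default latencies are the 0.3 and 0.5 second figures reported above:

```python
def estimated_total_seconds(rule_count, per_query_low=0.3, per_query_high=0.5):
    """Return the (low, high) estimate for querying rules one at a time.

    Assumes one database round trip per rule and ignores any fixed
    overhead, so the total grows linearly with the number of rules.
    """
    return rule_count * per_query_low, rule_count * per_query_high


low, high = estimated_total_seconds(1000)
print(f"1000 rules: roughly {low:.0f} to {high:.0f} seconds")
```

Because the cost is linear in the number of rules, any client timeout of 60 or 120 seconds is exceeded well before a policy reaches 1000 rules at these latencies.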

I ran a join query directly against the database, and it completed in
negligible time. This might indicate a performance issue with
SQLAlchemy, but _delete_all_rules_from_policy could also be optimized to
query all rules associated with the firewall policy at once instead of
one by one. I will submit a patch to optimize it accordingly.
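
The difference between the two access patterns can be sketched as follows. This uses Python's built-in sqlite3 module as a stand-in for the Neutron database layer. The table names, column names, and functions are hypothetical; they do not reflect the real neutron-fwaas schema or the contents of the eventual patch.

```python
import sqlite3


def build_db(rule_count):
    """Create an in-memory database with one policy and `rule_count` rules."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE rules (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE policy_rules (policy_id TEXT, rule_id INTEGER, position INTEGER);
    """)
    conn.executemany("INSERT INTO rules (id, name) VALUES (?, ?)",
                     [(i, f"rule-{i}") for i in range(rule_count)])
    conn.executemany("INSERT INTO policy_rules VALUES (?, ?, ?)",
                     [("policy-1", i, i) for i in range(rule_count)])
    return conn


def fetch_one_by_one(conn, policy_id):
    """Look up each associated rule with its own query (N + 1 statements)."""
    rule_ids = [row[0] for row in conn.execute(
        "SELECT rule_id FROM policy_rules WHERE policy_id = ? ORDER BY position",
        (policy_id,))]
    queries = 1  # the statement that listed the rule IDs
    rules = []
    for rule_id in rule_ids:
        rules.append(conn.execute(
            "SELECT id, name FROM rules WHERE id = ?", (rule_id,)).fetchone())
        queries += 1
    return rules, queries


def fetch_in_one_query(conn, policy_id):
    """Fetch every associated rule with a single join."""
    rules = conn.execute(
        "SELECT r.id, r.name FROM rules r "
        "JOIN policy_rules pr ON pr.rule_id = r.id "
        "WHERE pr.policy_id = ? ORDER BY pr.position",
        (policy_id,)).fetchall()
    return rules, 1
```

Both functions return the same rows in the same order. The first issues one statement per rule plus one to list the IDs, while the second issues a single statement regardless of how many rules the policy has.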

* Version:
  ** OpenStack version: Zed
  ** Linux Kernel: Linux node-1 5.4.119-19.0009.14

** Affects: neutron
     Importance: Undecided
         Status: New

** Description changed:

  * High level description: When the firewall policy is associated with
  thousands of rules, the update firewall
  policy(v2.0/fwaas/firewall_policies/<policy_id>) interface becomes very
  slow.
  
  * Pre-conditions: The firewall policy is associated with a large number
  of rules.
  
  * Step-by-step reproduction steps: When the firewall policy is
  associated with thousands of rules, the update firewall
  policy(v2.0/fwaas/firewall_policies/<policy_id>) interface becomes very
  slow. If the client sets the timeout to 60 seconds or 120 seconds, it
  will cause the client request to timeout.
  
  I added logs to the update firewall policy interface and eventually
  found that it took 429 seconds to call the _delete_all_rules_from_policy
  function.
  
- log detail:"2024-09-11 16:27:14.749 132 INFO neutron_fwaas.db.firewall.v2.firewall_db_v2 [None req-6af44478-e8e4-411a-939d-dc58daeb5503 c7040248a93044c38791ac91cb4590ab e8b6083c43514540b6686925c5581915 - - default default] _delete_all_rules_from_policy policy_id=3e83c14f-f013-4c6a-a0c7-96271f1fb814 start
+ log detail:
+ 2024-09-11 16:27:14.749 132 INFO neutron_fwaas.db.firewall.v2.firewall_db_v2 [None req-6af44478-e8e4-411a-939d-dc58daeb5503 c7040248a93044c38791ac91cb4590ab e8b6083c43514540b6686925c5581915 - - default default] _delete_all_rules_from_policy policy_id=3e83c14f-f013-4c6a-a0c7-96271f1fb814 start
  2024-09-11 16:34:23.942 132 INFO neutron_fwaas.db.firewall.v2.firewall_db_v2 [None req-6af44478-e8e4-411a-939d-dc58daeb5503 c7040248a93044c38791ac91cb4590ab e8b6083c43514540b6686925c5581915 - - default default] _delete_all_rules_from_policy policy_id=3e83c14f-f013-4c6a-a0c7-96271f1fb814 end, time=429.19302344322205
- "
  
  Further logging revealed that _delete_all_rules_from_policy, when
  processing, queries the rule information from the database one by one
  based on the associated rule IDs and then deletes them. The time to
  query each rule is between 0.3 to 0.5 seconds, so for a firewall policy
  associated with 1000 rules, the time taken would be around 300 to 500
  seconds.
  
  I performed a direct join query from the database, which is extremely
  fast and negligible. This might indicate a performance issue with
  SQLAlchemy, but there might also be room for performance optimization in
  _delete_all_rules_from_policy by querying all associated rules at once
  for the firewall policy, instead of querying one by one. For this, I
  will submit a patch to optimize it accordingly.
  
- 
  * Version:
-   ** OpenStack version: Zed
-   ** Linux Kernel: Linux node-1 5.4.119-19.0009.14
+   ** OpenStack version: Zed
+   ** Linux Kernel: Linux node-1 5.4.119-19.0009.14

** Description changed:

  * High level description: When the firewall policy is associated with
  thousands of rules, the update firewall
  policy(v2.0/fwaas/firewall_policies/<policy_id>) interface becomes very
  slow.
  
  * Pre-conditions: The firewall policy is associated with a large number
  of rules.
  
- * Step-by-step reproduction steps: When the firewall policy is
- associated with thousands of rules, the update firewall
- policy(v2.0/fwaas/firewall_policies/<policy_id>) interface becomes very
- slow. If the client sets the timeout to 60 seconds or 120 seconds, it
- will cause the client request to timeout.
+ * Step-by-step reproduction steps: 
+ When the firewall policy is associated with thousands of rules, the update firewall policy(v2.0/fwaas/firewall_policies/<policy_id>) interface becomes very slow. If the client sets the timeout to 60 seconds or 120 seconds, it will cause the client request to timeout.
  
  I added logs to the update firewall policy interface and eventually
  found that it took 429 seconds to call the _delete_all_rules_from_policy
  function.
  
- log detail:
- 2024-09-11 16:27:14.749 132 INFO neutron_fwaas.db.firewall.v2.firewall_db_v2 [None req-6af44478-e8e4-411a-939d-dc58daeb5503 c7040248a93044c38791ac91cb4590ab e8b6083c43514540b6686925c5581915 - - default default] _delete_all_rules_from_policy policy_id=3e83c14f-f013-4c6a-a0c7-96271f1fb814 start
+ log detail:"2024-09-11 16:27:14.749 132 INFO neutron_fwaas.db.firewall.v2.firewall_db_v2 [None req-6af44478-e8e4-411a-939d-dc58daeb5503 c7040248a93044c38791ac91cb4590ab e8b6083c43514540b6686925c5581915 - - default default] _delete_all_rules_from_policy policy_id=3e83c14f-f013-4c6a-a0c7-96271f1fb814 start
  2024-09-11 16:34:23.942 132 INFO neutron_fwaas.db.firewall.v2.firewall_db_v2 [None req-6af44478-e8e4-411a-939d-dc58daeb5503 c7040248a93044c38791ac91cb4590ab e8b6083c43514540b6686925c5581915 - - default default] _delete_all_rules_from_policy policy_id=3e83c14f-f013-4c6a-a0c7-96271f1fb814 end, time=429.19302344322205
+ "
  
  Further logging revealed that _delete_all_rules_from_policy, when
  processing, queries the rule information from the database one by one
  based on the associated rule IDs and then deletes them. The time to
  query each rule is between 0.3 to 0.5 seconds, so for a firewall policy
  associated with 1000 rules, the time taken would be around 300 to 500
  seconds.
  
  I performed a direct join query from the database, which is extremely
  fast and negligible. This might indicate a performance issue with
  SQLAlchemy, but there might also be room for performance optimization in
  _delete_all_rules_from_policy by querying all associated rules at once
  for the firewall policy, instead of querying one by one. For this, I
  will submit a patch to optimize it accordingly.
  
  * Version:
    ** OpenStack version: Zed
    ** Linux Kernel: Linux node-1 5.4.119-19.0009.14

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2080447

Title:
   Update firewall policy interface timeout when it's associated with a
  large number of rules

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2080447/+subscriptions