Conversational AI Cloud Publication unavailable
Incident Report for CM.com
Postmortem

Version 4.0 of Conversational AI Cloud was introduced to prepare the service for the high growth we are currently experiencing. To increase performance and scalability, we improved the way the system handles your content, the heart of an effective conversational agent.

Originally, the release was scheduled for a maintenance window of three days, from April 29th 09:00 to May 1st 23:59. Test results in the final stages of the release indicated that the Publication process did not work correctly: it incorrectly translated content in the CMS into the production chat logic, which would lead to bots and agents responding to end users in unexpected and incorrect ways. Therefore, it was decided on Sunday May 1st at 23:15 to extend the maintenance into Monday May 2nd and keep Publication disabled.

Throughout Monday and Tuesday, the team discovered additional issues with the publication process that required rework and additional testing. Several customers also experienced problems with other features of Conversational AI Cloud, such as newly created QAs overwriting existing ones and problems with the Dialogs that prevented a hand-over to live chat. At that point, it was decided to escalate the maintenance window into a P2 incident.

On Wednesday May 4th at 23:27, all the issues had been resolved and thoroughly tested, and publication was re-enabled for customers.

We regret the course of events during the previous week and apologize for the inconvenience. Many of you wanted to publish changes to active content, and it took too long for that to be enabled. We take this very seriously and are taking action to reduce the risks of complex Conversational AI Cloud product releases and to improve incident communication.

  • We will review and update our communication approach for future releases, maintenance windows, and incidents, to bring it on par with other CM.com solutions. This page (status.cm.com) will be the central hub of operational product communication.
  • We will review and improve our test and release procedures to ensure that possible issues are discovered earlier in the process.
Posted May 10, 2022 - 17:02 CEST

Resolved
This incident is resolved. We will follow up on individual issues with customers directly.
Posted May 05, 2022 - 10:32 CEST
Monitoring
Publishing has been enabled for almost all customers. (If it is not enabled on your account, a CSM will contact you tomorrow morning, and publishing will be enabled for you in the course of the morning.) We are monitoring traffic for any odd behaviour.

We sincerely apologise for all the inconvenience caused. Our team will review the incident for causes and learnings.
Posted May 04, 2022 - 23:27 CEST
Update
Tests are successful. We will begin releasing the updates across our customer base soon, while carefully monitoring traffic and behaviour. The aim is to restore publishing functionality for all customers tonight.
Posted May 04, 2022 - 20:15 CEST
Update
Our engineering team is still in the testing process. We will keep you informed on our progress.
Posted May 04, 2022 - 18:21 CEST
Update
The team has fixed the issues with the publishing process and is currently re-entering the testing phase. This is necessary to ensure that these fixes do not introduce new issues and that publishing works correctly.

The issue that some customers experienced with their Question/Answers being overwritten has also been addressed. Any overwritten content will be restored.

Thank you for your patience. We will keep you informed of our progress.
Posted May 04, 2022 - 17:17 CEST
Update
The team is making significant progress reviewing and testing every step within the publishing process to resolve issues surrounding release 4.0.
Posted May 04, 2022 - 14:35 CEST
Update
Our engineering team is still working around the clock to re-enable the publication function within the Conversational AI Cloud CMS. During regression tests, as part of the major release last weekend, the publication function was found to work incorrectly. This means that changes made within the CMS cannot be guaranteed to carry over correctly to the staging and production versions of bots. This is an unexpected side effect of the release, and the team is reviewing and testing every step within the publishing process to resolve the issues. CM.com network operations is working with the engineering teams to assist where possible and to keep you updated regularly.

Bots and conversational agents currently deployed on production remain operational without any issues, and other CMS functions are also available.
Posted May 04, 2022 - 12:23 CEST
Update
The Conversational AI Cloud team has identified the issue preventing Publication from being enabled.
The team is working on a solution.
Posted May 03, 2022 - 10:20 CEST
Update
Our team is working hard on finalizing the last stages of release 4.0.
Today we encountered some unexpected issues forcing us to keep the publication disabled.
We understand this may be inconvenient and are working hard to get all features back up and running.
Posted May 02, 2022 - 17:33 CEST
Identified
The issue with Transactional Dialogs has been identified and fixed.
Posted May 02, 2022 - 12:15 CEST
Update
We're investigating issues with Conversational AI Cloud’s Production - API.
In some cases, Transactional Dialogs return errors when invoked.
Posted May 02, 2022 - 11:43 CEST
Investigating
Our team is still working hard on finalizing the last stages of release 4.0.
All systems are back online, but publication remains disabled while the team runs its final tests.
We understand this may be inconvenient and are working hard to get all features back up and running.
Posted May 02, 2022 - 10:28 CEST
This incident affected: Conversational AI Cloud (Production Gateway, CMS).