Feb 2, 2023 7:40AM Pacific - Reports of Slowness - Resolved 12:34AM Pacific

At 7:40AM Pacific we received a few reports of slowness and confirmed that there was some packet loss across our firewall. We spoke to our host, who confirmed that a small-scale DDOS attack was occurring and that they were working on mitigating it. As of 8:10AM traffic appears to be moving more normally, but our host is still working on full mitigation.

UPDATE 10:05AM Pacific: Speed issues appear to be largely resolved but the new rules implemented on our firewall to mitigate the DDOS attack appear to be causing some builds to intermittently not load from specific IP addresses. It does not appear to be widespread but we've confirmed that at least a handful of clinics are periodically unable to connect from specific IP addresses. We're currently working with our host to better understand what might be causing this issue.

UPDATE 1:10PM Pacific: It seems like a small number of users are still having intermittency issues. Overall, performance seems quite good, but we're collecting the IP addresses of users who are reporting connectivity issues to see if we can establish why they're being specifically blocked. As a longer-term solution, our host is scheduled to do an "interim" network upgrade tonight at 9PM Eastern/6PM Pacific. This is a patch to distribute our inbound traffic over a larger number of firewalls, but it should let us tone down the aggressiveness of the firewall rules to ensure that we aren't seeing disruption. Parts of our network will be moved behind a different firewall to isolate some higher-volume clients from the rest of our traffic. Some of this load will shift tonight, and if everything goes well we'll shift additional load over the next few days.

UPDATE 6:55PM Pacific: Our host is currently in the process of moving parts of our network behind a mirrored firewall. We expect this work to be complete in the next hour or so. In the meantime we're still seeing some intermittency but once the switch is complete we'll be running a few more tests. If you're still having issues with intermittent unavailability please contact support@cer.bo

UPDATE 9:15PM Pacific: Traffic to our US data center is now successfully load-balancing between multiple firewalls. We will begin slowly moving more traffic onto the newly deployed firewall as we become more comfortable that it is performing as expected. Our host is reporting that there are some remaining issues at the data-center level related to the host's Edge network (outside the Cerbo private cloud) that they are currently working on, but moving forward our own firewall set-up should be more resilient. We'll continue to post updates until our host has confirmed they've resolved everything.

UPDATE 12:34AM Pacific: Intermittency issues are confirmed as resolved. Our host found a faulty load-balancing route that was causing some traffic to not route correctly just after their edge network. This has been repaired and all systems are behaving normally.

Feb 1, 2023 11:14AM Pacific - Reports of Slowness - Resolved 11:57AM Pacific

We had intermittent reports of clinics having latency issues for ~40 minutes. It didn't appear to impact all clinics, and it appears to have popped up fairly randomly (speeds would be high for the vast majority of calls but periodically a specific attempt to load a resource would take several seconds to resolve). Our host has made a firewall adjustment that appears to have stabilized the issue, but we're still monitoring and will be meeting with them to get an update on a scheduled replacement of our firewall infrastructure that is currently being built out.

Jan 10, 2023 11:38AM-11:48AM Pacific, 12-12:05PM Pacific - Brief outage on a single server node, affecting ~4% of builds

A faulty power cable that powers one server node briefly knocked that node offline. This affected ~4% of client EHRs, which were inaccessible for about 15 minutes. The power cable was replaced and the node appears to be operating normally.

Jan 9, 2023 7:17AM-12:00PM Pacific - Slow connectivity reported - Resolved

11:30AM Pacific: Our host was able to migrate our firewall infrastructure to a new environment that appears to be able to handle the spike in DDOS traffic dramatically better. We're seeing latency rates return to normal, and you should not notice any significant lagging. We've also had them re-enable SFTP traffic so office scanners should be working again. We're continuing to monitor and will be posting a longer debrief once we've been able to properly summarize this event.
10:30AM Pacific: We've continued to communicate with our host about the ongoing performance issues. It appears that this is related to a DDOS attack that is causing our geographic IP filters to become overwhelmed, but they are still looking at a work-around. They may restart the firewall (2-4 minutes of downtime) as part of these new settings. We sincerely apologize for the ongoing performance issues. We know that it makes it difficult to work when pages are taking 20 seconds to load, and we're doing everything in our power to mitigate this issue.
9:45AM Pacific: Our host's networking team restarted the firewall (resulting in ~1 minute of downtime) but it does not appear to have made a significant impact on performance. They're continuing to investigate.
9:15AM Pacific: Our host migrated some of our firewall infrastructure to a new node which has improved speeds, but latency is still worse than normally expected. They're continuing to investigate other complicating factors.
8:54AM Pacific: Our host believes they've identified the cause of the issue and will be performing a live migration of some of our firewall infrastructure. They believe that this will resolve the issue. We should have an update in 15-20 minutes.
8:17AM Pacific: Our host has been working on mitigating an issue with our firewall load, and we're seeing improvement in overall speeds, but the issue is not yet resolved. Our host should be supplying an update shortly.

Dec 21, 2022 8:20AM-12:05PM Pacific - Slow connectivity reported - Resolved

This issue appears to have been caused by the GeoIP blocking Firewall service getting overloaded by a DDOS attack on one of our IP sub-ranges. Our host has adjusted some settings that should reduce this load but we'll be restarting the firewall at 10:15PM Pacific time tonight and are scheduled for a major firewall replacement early next month.

Oct. 18, 2022 9:03-10:38am Pacific - Slow connectivity Reported - Resolved

High CPU load on our primary firewall caused some builds to be intermittently slower than normal. The source of this CPU load was identified and all services appear to be running at normal speeds again.

Oct. 13, 2022 9:06-9:47am Pacific - Reported latency - Resolved

Hint Health experienced issues this morning that caused clinics linked to Hint to see increased latency when loading certain pages. Hint is working on this issue and will be posting updates here: https://status.hint.com/

Sept 29, 2022 Potential impact from Hurricane Ian - no incident to report

Thankfully, there is no incident to report, but we continue to monitor Hurricane Ian's progress through Florida, where our primary servers are located. About a third of the area as a whole is without power, but the server data centers are powered, with backup generators on standby. Networks are up and running and data is flowing normally.

Sept 21, 2022 12:10-12:35pm Pacific - ~10% of builds reported connectivity issues - Resolved

A server node locked up at 12:10pm Pacific time, causing the builds that are hosted on that node to become inaccessible. We worked with the server company to get the builds back online. Services began coming back online at 12:30pm.

August 12, 2022 10:30am - Bluefin/Pax Chrome Issues - Resolved

Bluefin has confirmed that the most recent version of Chrome is causing issues with the SaasConex system, which causes Pax A80 devices to routinely fail to report the result of a transaction back to Cerbo. This can result in a payment going through but not registering in Cerbo. Bluefin is working on the issue, but in the meantime we recommend that users who use the A80 devices for checkout switch to Firefox pending a confirmed resolution.

July 19, 2022 10-10:40am Pacific - Connectivity and slowness issues - Resolved

Widespread slowness was reported, along with instances of users being unable to log in to or access their EHR, starting around 10am Pacific time. This was caused by the ePrescribing relay server locking up while part of the service still reported as live. This caused a massive number of connections to be held open on our network, putting the firewall under heavy load and causing slowness across most of our network. Some clinics experienced lagging, while others were unable to connect at all for several minutes. We have resolved the issue, and will continue to work with our server company to better understand the issue and prevent it in the future.

June 22, 2022 ~1:14pm - Interfax outage - Resolved

Interfax is reporting an outage. We have pushed a fix for the Sent tab in the meantime.

May 2, 2022 ~1:30-2:15pm Pacific - Error Sending Emails - Resolved

Some clients are experiencing errors sending emails (which may also affect adding appts with the email notice enabled). This may have to do with a fix put in place to remedy the Chrome security certificate bug. We are actively investigating.

May 2, 2022 ~11:12am-1:25pm Pacific - Chrome Error Generating Certificate - "Your connection is not secure" - Resolved

A recent update to the Chrome browser is causing an issue with the certificate generation. The connection IS secure via HTTPS - this is just an issue with Chrome not recognizing the valid security certificate. Pending a solution, please try using a different browser (Firefox or, for Mac users, Safari).

Mar 1, 2022 ~9:40am-12:00pm - Twilio issues: Creating or joining telemedicine calls - Resolved

Twilio (the network behind Cerbo's integrated telemed functionality and appointment reminder SMS messages) is experiencing failures with creating or joining video call rooms. More information here: https://status.twilio.com/incidents/rcbm2q15rlkl. As of 12pm Pacific time, this appears to be resolved for new calls. Trying to join a call that you had previously tried to join during the outage may still not work.

Mar 30th, 7:40AM Pacific - Quest's Webservices Servers Outage (Update 7:52AM - Resolved)

Relaying orders to/from Quest appears to be down (an issue on Quest's end) and submitting an order may lock up your browser as it attempts to communicate with Quest. If your browser is locked up after trying to send an order to Quest you may need to completely restart your browser. Please refrain from sending Quest orders for at least an hour while we wait to hear back from Quest.

UPDATE 7:52AM: We're seeing Quest's servers coming back up now, though connections are slower than usual. We don't have confirmation that everything is fixed but it seems like connections to Quest should be working again.

UPDATE 8:05 AM: Quest continues to improve operational speed and is within tolerable limits now. No official word from Quest but we're going to mark this as "resolved" for now.

May 2nd, morning Pacific - Comcast issues causing connectivity problems for some Cerbo users in California and Oregon

Some users in California and Oregon appear to be having intermittent issues connecting to Cerbo. This seems to be an issue with one of the west-coast Comcast/Xfinity networks and is not a Cerbo issue. If you have an alternative network or can tether on your phone, you should be able to bypass Comcast's problematic network.

Jan 13, 2022 ~11:20AM - SureScripts e-prescribing network - Resolved

SureScripts has resolved e-prescribing network connectivity issues.

Jan 5, 2022 ~11:00am Pacific - EHR outages reported - Resolved

One of the nodes hit a critical disk space usage warning and shut down. The cloud team is looking into it now and is hoping to have it back up soon. We will provide updates here as we receive them.

Oct 21, 2021 11:54AM - Amex outage - Resolved

Issue: the American Express credit card processing network is experiencing an outage that is affecting clients trying to charge Amex cards through our integrated merchant services processors. See status.bluefin.com for the most up-to-date updates on this issue.

Aug 12, 2021 8:26-11:06AM - EPCS routing problems - Resolved

Issue: problems with routing controlled substances. Our eRx two-factor verification partner encountered server issues with routing controlled substances, but they have confirmed that sends are now processing successfully.

July 21, 2021 9:26-10:40AM - SMS sending issues - Resolved

Issue: outbound SMS messages failing to send. Failure was addressed by the carrier.

June 1, 2021 7:12-10:36AM - Zoom phone issues - Resolved

Zoom identified the issue causing latency with Zoom Phone service when accessing the web portal and SMS messaging. They will continue to provide updates at https://status.zoom.us/, but the issue currently seems to be mostly resolved.

May 10, 2021 7:15-8:10AM Pacific - Latency Issues - Resolved

Issue: latency/ slowness. The latency reported this morning, which followed a small-scale DOS attack on our firewall, appears to be resolved. Our host migrated our primary firewall and implemented some geographic blocking rules which appeared to resolve the issue.

April 13, 2021 5:30AM - 2:48PM - Latency - Resolved

Issue: Latency/ slowness affecting ~20% of users. Our host successfully repaired the root cause of the latency that around 20% of our users were experiencing today. A physical interface issue on one of the data-center's network switches was causing an unacceptable level of packet loss on outbound traffic for nodes that were connected through the faulty switch. This interface has been removed from the network and replaced and traffic speeds immediately returned to normal. We're waiting on a full post mortem from our host's network team, but in the interim load times should be dramatically faster.

March 25, 2021 9:21-9:54AM - Latency - Resolved

Issue: low-level latency/ slowness affecting the majority of clients. Most users would not have noticed this, but some services were moving slower than usual. Our host rolled out an update to address firewall issues that resolved the problem.

March 22, 2021 - performance issues related to Active Campaign integration - Resolved

Customers who have Active Campaign integrated into their Cerbo accounts were reporting performance issues. We disabled the integrations until Active Campaign's performance recovered, then re-enabled the syncs.

Jan 20, 2021 8:00-8:12AM - Inbound fax errors - Resolved

Our Fax carrier's API was returning errors, but it appears that they've resolved the issue. If you saw red error messages on some fax-related screens (inbound fax queue, sending a new fax, and loading the Sent document tab), those messages should disappear when you refresh the screen.

Dec 16, 2020, 9:05-11:00AM - Latency - Resolved

Some builds were experiencing intermittent slow performance. The issue was caused by significantly higher than normal traffic on our central servers, which caused a slowdown at the firewall. We will be upgrading the firewall after hours tonight to avoid a recurrence of the issue. Slowdown issues appear to have resolved as of about 11am Pacific, but we will continue to monitor this issue throughout the day. In the meantime, anyone who runs into slowness should find that the issue resolves itself within a few minutes.

July 28, 2020, 10:10-11:35AM - Outbound email issues - Resolved

Our primary delivery service (Sendgrid) was experiencing API issues that were causing outbound emails to be rejected a small percentage of the time. We temporarily rolled over to an alternate mail service and rolled back to the primary when the issue was resolved.

Jun 12, 2020 7:44-8:00AM - Brief EHR outages for ~15% of builds - Resolved

A server node rebooted unexpectedly, causing outages of between 2 and 15 minutes. This is the same node that experienced an outage last week, and the root cause is still unclear, so we are migrating all builds off of the problematic node.

Jun 4, 2020 2:13-2:35PM PST - Brief EHR outages for ~15% of builds - Resolved

A handful of outages occurred on one server node. The cause could not be determined at the time. Actions taken to prevent recurrence: several builds were migrated off of this node to free up additional space/ resources, and additional monitoring was implemented.

Apr 21, 2020 8:30AM-10AM - Major internet carrier issues (not Cerbo-related) - Resolved

One of the major cross-country data lines is experiencing an outage, which is causing speed and connection issues for many connecting from western states - Colorado in particular (that is where the outage is located), but also clients whose internet connection routes through Colorado. Our hosting company dropped their Level 3 routes to force connection into the Cogent network and this appears to have resolved the issue by around 10AM PST.

Apr 4, 2020 11:40-12:08PM - EHR outage affecting ~10% of builds - Resolved

The same node shut down again, impacting ~10% of our clinics. Engineers determined that it was a faulty DIMM chip on the affected node and did an emergency hardware swap to replace the system RAM. All services should be restored and engineers are monitoring the host.

Feb 3, 2020 4:09-4:41PM - EHR outage affecting ~10% of builds - Resolved

Cloud server node went down, causing outage for some customers.

Jan 6, 2020 - Payment processing issues - Resolved

Bluefin servers are down resulting in failed credit card payment processing for some customers.

Nov 4th, 2019 - Performance issues affecting less than 10% of builds - Resolved

High CPU load on one node. Performance issues resolved, migration scheduled for after hours 10/4/2019 for permanent resolution.