Saturday, January 14, 2012

brain retrieval and data file analysis

Again the background: the Saipan CREWS station lost communications (October 2nd), an attempt was made to power-cycle the cellular modem (October 4th or 5th) followed by the retrieval of that modem (November 14th or 15th) for testing and evaluation, a land-based test of the modem near the station (November 21st) and the reinstallation of the cellular modem at the station (November 28th or 29th). This was followed by an unsuccessful attempt to connect to the station by radio (December 15th).

The only diagnostic step left was to retrieve the station's control unit (or "brain"). This unfortunately would have the effect of shutting down all station operations except for the PacIOOS-supplied CTD, which has its own battery backup. However, at this point we still had no assurance whatsoever that the station had continued to operate in any capacity after its initial loss of communications on October 2nd.

On December 19th (though mindful of the disruptive nature of the holiday season) I sent out detailed instructions to David Benavente (Coastal Resources Management, Saipan) and Steven Johnson (Division of Environmental Quality, Saipan) on how to safely disconnect all power and instruments at the station and remove the brain. On January 3rd I followed up with instructions on how to recover the station's locally-stored data files once the brain had been retrieved.

On Thursday, January 12th at 8:30am (Saipan time) Steven reported some good news:
"David and I were able to retrieve the brain out of the station yesterday. We will start doing some preliminary trouble shooting today. We will keep you posted on our progress."
This was followed later on at 3:39pm with a message from David saying:
"So I connected to the control unit and downloaded data for the first TAB0/ TAB1 and TAB2. I uploaded them on to the AOML ftp site. I didn't have to enter a username when uploading the files so I wasn't sure whether they had gotten through. Just let me know if they didn't and I'll try again.

"Oh another thing that I noticed while retrieving the brain was that there was a bit a moisture inside the grounding plug. As I pulled the two ends apart a drop of liquid fell onto my hand. Not sure what the implications of that are but it seemed odd because everything else was dry. Well thats it for now, I'll continue to download the other files and send them over. Let me know if the files downloaded correctly."
This represented the first new data report from the station since it lost communications on October 2nd. I will try to be very clear about what I have learned:
  1. The station has been completely offline since early October, 2011. This of course is a crushing disappointment for all of us.
  2. The biggest surprise is that the data stream ends on October 4th, not on October 2nd when communications were lost. There are no indications of unusual circumstances in the data record at the time when the station initially lost communications. This initial loss of communications took place at UTC Sunday, October 2nd 19:10 (in Saipan time this is at 5:10am on the morning of Monday, October 3rd).
  3. This means that the initial loss of communications is still unexplained. My guess would be either a gradually worsening loose power connection or some kind of progressive failure of the cellular modem. Our best evidence about this failure remains Ross Timmerman's discovery of the unusually large number of system resets by the modem (described in this previous blog posting).
  4. The station's data record ends at UTC Tuesday, October 4th 4:12 (in Saipan time this would have been at about 2:12pm on Tuesday, October 4th). Again, there is no indication whatsoever in the data record of any problems leading up to the moment of total systems failure. The station appears to have been operating perfectly for that last day and a half (about 33 hours) except for its loss of communications. I have gone instrument by instrument and diagnostic by diagnostic over every retrieved data point and the station appears to have been operating perfectly up to the very last minute.
  5. Thus far I have only seen the station's 1-day, 60-minute and 6-minute data tables. There may be some further details to be gleaned from the other three data tables (1-minute, 30-second and 5-second). But based on what I've seen so far, there are not likely to be any significant revelations in these other data tables.
  6. There is no possibility of data corruption (i.e., it is not possible that the station continued to operate normally after October 4th but we merely failed to recover its complete data record) because the datalogger numbers its records and none are missing.
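
For anyone who wants to double-check the two pieces of arithmetic in the list above (the UTC-to-Saipan time conversions and the "no missing records" test), here is a minimal Python sketch. It is not the procedure I actually used. It assumes that Saipan time is a fixed UTC+10 offset with no daylight saving, and that the datalogger's record numbers are plain consecutive integers that do not wrap around within a table. The function names and the sample record numbers are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Saipan (Chamorro Standard Time) is a fixed UTC+10, no daylight saving.
CHST = timezone(timedelta(hours=10), "ChST")


def to_saipan(utc_dt):
    """Convert a timezone-aware UTC datetime to Saipan local time."""
    return utc_dt.astimezone(CHST)


def missing_record_numbers(record_numbers):
    """Return the record numbers absent from a run expected to be consecutive.

    Assumes the numbers do not wrap around within the run.
    """
    if not record_numbers:
        return []
    present = set(record_numbers)
    return [n for n in range(min(present), max(present) + 1) if n not in present]


# The two timestamps quoted in the list above, in UTC.
comms_lost = datetime(2011, 10, 2, 19, 10, tzinfo=timezone.utc)
last_record = datetime(2011, 10, 4, 4, 12, tzinfo=timezone.utc)

print(to_saipan(comms_lost).strftime("%Y-%m-%d %H:%M"))   # 2011-10-03 05:10
print(to_saipan(last_record).strftime("%Y-%m-%d %H:%M"))  # 2011-10-04 14:12
print(last_record - comms_lost)                           # 1 day, 9:02:00

# Hypothetical record numbers, not taken from the retrieved files.
print(missing_record_numbers([101, 102, 103, 104]))  # []
print(missing_record_numbers([101, 102, 104]))       # [103]
```

An empty list from `missing_record_numbers` for a table is what "none are missing" means here. Any number it returns would point to a gap in the retrieved file.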
One question remains to be investigated: when exactly was the modem power-cycled? To my mind the most likely explanation is that the cellular modem failed for reasons unknown and then some further problem was accidentally introduced when the station top was opened up during that October 4th/5th power-cycling visit. Given that there is no sign of station or instrument distress up to the moment of failure, the most likely explanation is human intervention.

This, sadly enough, is the risk that we take every time that we open up the CREWS station top and work with its innards. Normally our "insurance policy" is to connect to the station via radio from the boat every time after closing up the station top. But unfortunately we did not have the time, weather, equipment and software that we needed to configure and test the radios during installation in August. This left the on-site maintenance team without their most important tool.

The takeaway from all this is that we probably have a wire (or many wires) pulled loose somewhere. We know that the datalogger probably lost power on October 4th and has never regained it. We know that the cellular modem did not appear to have power when it was reinstalled on November 28th or 29th. We also have reason to think that some of the other instruments, possibly the SIO4 serial ports or the radio, have had at least intermittent power since October 4th, given David's report of seeing lights blinking on November 28th or 29th.

So the next thing to try would be a visual examination of the "brain" unit for signs of loose wires. This may be followed either by attempts to reinstall the brain in the station or perhaps by shipping the entire "brain" package back to Miami for evaluation. I will update this blog again when we have decided on our next step.