PUBLIC-NOTICES: Forum Changes/Fixes. Any Questions?

  • Re: PUBLIC-NOTICES: Forum Changes/Fixes. Any Questions?

    Forums going down for maintenance. Quite a bit to get done. All services (pics, forums, testforums, and the DB) need to be stopped for maintenance in preparation for a move to a new server that DT bought. (It should be faster... much faster.) There is no ETA on that move. Expected down-time is about 1 to 3 hours.

    Planned maintenance: Monday, March 7, 2011 at 9:00pm Pacific time for 1 to 3 hours.



    • Originally posted by TheCotMan
      Forums going down for maintenance. Quite a bit to get done. All services (pics, forums, testforums, and the DB) need to be stopped for maintenance in preparation for a move to a new server that DT bought. (It should be faster... much faster.) There is no ETA on that move. Expected down-time is about 1 to 3 hours.

      Planned maintenance: Monday, March 7, 2011 at 9:00pm Pacific time for 1 to 3 hours.
      Most of the scheduled maintenance went well.

      * I was able to upgrade the forum software
      * I was able to back up the database to disk, in preparation for moving data to a new server

      Then things took a wrong turn.

      While I was checking the results of the upgrade to make sure things were working, I started compressing the dumps from the database. After I finished my last check, I went to my shell, and my session was hung. I gave it time, but that resulted (eventually) in a disconnect. I talked to Jeff about it, and he found the same problem.

      Jeff visited the office at around 1:00am Pacific time to investigate the trouble, and found the server locked up. Nothing was on screen, pressing keys did not remove console blanking, pressing the power button did not shut it down, pressing the reset button did nothing, holding the power button down for many seconds did nothing... he had to yank the power cable in order to power it down.

      On next POST, nothing but a power light was present: no beep code, no evidence of POST progress, no progress at all. Even after removing all of the hardware except the motherboard, PSU and CPU, powering it up gave no hint of POST, error, or error codes, or beep codes. He tried replacing the battery on the motherboard, too, just in case that was an issue, but no help. It looks like the motherboard just died.

      The old OS was too old to recognize hardware on the new server, so he connected the RAID controller to the new server, powered it up and I copied files from the old server to the new server. (All of the backups completed on the dead server. The dead server died while compressing the backups on disk. Compressed copies were damaged and incomplete, but the uncompressed copies were fine.)
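The damaged compressed copies above suggest one precaution: keep the uncompressed dump until the compressed copy has been read back in full and compared against it. The following is a minimal sketch of that check, not the procedure used on the forum server. The post does not say which compression tool was used, so gzip is an assumption, and the file name and function names are hypothetical.

```python
import gzip
import hashlib
import os
import shutil
import tempfile
import zlib


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks so a large dump never has to fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def compressed_copy_is_intact(original_path, gz_path):
    """Return True only if gz_path decompresses completely and matches original_path."""
    try:
        with open(original_path, "rb") as raw, gzip.open(gz_path, "rb") as packed:
            return sha256_of_stream(raw) == sha256_of_stream(packed)
    except (OSError, EOFError, zlib.error):
        # A truncated or corrupt archive fails while being read back.
        return False


def demo_truncated_archive():
    """Compress a stand-in dump, then cut the archive short to mimic a crash."""
    with tempfile.TemporaryDirectory() as workdir:
        dump = os.path.join(workdir, "forum_dump.sql")  # hypothetical file name
        packed = dump + ".gz"
        with open(dump, "wb") as fh:
            fh.write(b"INSERT INTO post VALUES (1, 'test');\n" * 5000)
        with open(dump, "rb") as src, gzip.open(packed, "wb") as dst:
            shutil.copyfileobj(src, dst)
        good = compressed_copy_is_intact(dump, packed)

        # Simulate the machine dying mid-compression: keep only half the archive.
        with open(packed, "r+b") as fh:
            fh.truncate(os.path.getsize(packed) // 2)
        damaged = compressed_copy_is_intact(dump, packed)
        return good, damaged
```

Only after `compressed_copy_is_intact` returns True would it be reasonable to delete the uncompressed dump; in the failure described above, the uncompressed copies were what made recovery possible.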

      Since Tuesday at 1:00am, when not at my real job or sleeping, I've been working on re-building services on this new server from source. We are still not quite ready (as of this posting), but we are getting there.

      As I post this, I think the problems with PHP have been fixed, but I am testing the software, looking for more troubles with PHP.

      * DB working
      * Web server working
      * PHP on web server can talk to DB
      * The problems with PHP appear to have been solved
      * I am here now: Testing of the software

      I've not yet started on pics.defcon.org or the testforums.

      After service is restored, I will need to stop service a few times over the weekend in order to refine and automate some of the changes that are needed, but were skipped in order to get the forums back online faster.

      Additionally, Jeff will want to bring down the new server a few times to make other changes.

      Oh, and that is not all... Jeff has plans to get another server to replace this server, soon, so we will be moving software again to it, once it is ready.

      This is NOT how I like to run servers. Since I've been admin, I've worked out "development spaces" so that we can test many new things to make sure they work before applying them to the production instance of things to be changed. I have been working with Jeff for a few months on the new server. We spent most of the time conducting research and making decisions on which platform options would be best to choose and which direction to take in configuring the new server. Last weekend, I managed to apply our decisions to make an instance of a DB, but I needed data. After I finished the maintenance on the (now) dead server, I was going to import the data from the production database into the dev DB, and then test the forum software, pics software, and more, on the new server. If it worked for me, then I would invite the mods to test it out. Once it appeared to be working, we would then schedule more maintenance on the production server, dump/restore the DB from the old to the new server, change network information as we shut down the old server, and then be live on the new server with a minimal interruption.

      But no. The old server chose not to cooperate. COWARD! Fix yourself and fight me! ;-)

      Ignore the rest of this. These are comments on tests I performed before 9:00pm pacific time today.

      This post is doubling as a test post.


      Next:
      Test reply.

      Next:
      Test Edit.

      Next:
      Test merge
      Last edited by TheCotMan; March 10, 2011, 21:18. Reason: Test merge



      • Re: PUBLIC-NOTICES: Forum Changes/Fixes. Any Questions?

        All of the services have been rebuilt again from source.

        Please post problems you find with the new system here.

        Expect scheduled outages over the next week or two, probably 1 hour or less each day. It will take a while for things to settle back to stable.

        Over the next 24 hours, I will be working to bring back pics.defcon.org, then testforum (because Neil really needs to test something for the forums), and then, if DT can get me a new IP, a web server instance and environment for his contest.

        Constructive criticism is welcome.

        Thanks!
        -Cot
        Last edited by TheCotMan; March 10, 2011, 21:18.



        • Re: PUBLIC-NOTICES: Forum Changes/Fixes. Any Questions?

          First issue found: email from forum for password reset, account creation/email validation, PM notification broken.

          I know the fix, and will include that in the next maintenance cycle tomorrow morning, which should only take about 10 minutes.

          What else needs to be fixed that can be checked and fixed in the next maintenance cycle?



          • Re: PUBLIC-NOTICES: Forum Changes/Fixes. Any Questions?

            Originally posted by TheCotMan
            First issue found: email from forum for password reset, account creation/email validation, PM notification broken.

            I know the fix, and will include that in the next maintenance cycle tomorrow morning, which should only take about 10 minutes.

            What else needs to be fixed that can be checked and fixed in the next maintenance cycle?
            Scheduling maintenance today at 1:00pm Pacific time. Forums will probably only be down for 10 to 15 minutes, maybe less. I'm looking to fix the mail problem.



            • Re: PUBLIC-NOTICES: Forum Changes/Fixes. Any Questions?

              Originally posted by TheCotMan
              Scheduling maintenance today at 1:00pm Pacific time. Forums will probably only be down for 10 to 15 minutes, maybe less. I'm looking to fix the mail problem.
              I enabled a fix that permitted the forum IP to talk to the mail server so the forums can send email to new users, and to existing users for password resets and PM / subscribed thread notification. However, another network device observed this "new" behavior and blocked the forums from talking to the network for a while.

              I've backed out these changes and restarted services, so the forums won't try to send email to the mail server. Getting this fixed will require someone else who has access to that device to allow this activity.

              If you need to reset your password or create a new account while email access from the forums is down, email defconforums@gmail.com from the email address associated with the account in question. (If they don't match, I won't change the account.) I will confirm that you actually sent the email request and the email is not forged, so expect a reply asking you to confirm you made a request by replying again, including a special, unique string I will email to you.
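The confirmation step described above is a small challenge-response exchange: a unique string goes to the address on file, and the request is honored only if a reply comes back carrying the same string. The following is a minimal sketch of that idea, not the process actually used for the forums (which was handled by hand over email). The class and method names are hypothetical, and pending strings are kept in memory only.

```python
import hmac
import secrets


class ConfirmationDesk:
    """Tracks one pending confirmation string per email address (hypothetical helper)."""

    def __init__(self):
        self._pending = {}

    def issue(self, address):
        """Create an unguessable string to send to `address`."""
        token = secrets.token_urlsafe(16)
        self._pending[address.lower()] = token
        return token

    def confirm(self, address, replied_token):
        """Accept the request only if the reply carries the string that was issued."""
        key = address.lower()
        expected = self._pending.get(key)
        if expected is None:
            return False
        # Constant-time comparison; bytes are used so non-ASCII replies cannot raise.
        if not hmac.compare_digest(expected.encode(), replied_token.encode()):
            return False
        del self._pending[key]  # each string is valid for one confirmation only
        return True
```

A forged request fails because the forger never sees the string, which is delivered only to the mailbox associated with the account.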

              Thanks!
              -Cot



              • Re: PUBLIC-NOTICES: Forum Changes/Fixes. Any Questions?

                Today starting some time around 2:15 pacific time, it seems the ISP that provides service for the forums, www.defcon.org and other defcon servers dropped service for Defcon and other locations in the area. Jeff and Neil discussed this and Neil stayed on top of it with the ISP. At around 7:55pm pacific time this same day, Neil informed me that he was notified that access had been restored.

                This has not been a good week for hardware and Internet access at defcon. :-P~

                Thanks Jeff and Neil for getting information on this.

                If you want the latest information on this kind of stuff, it is a good idea to follow the official defcon twitter account:

                http://twitter.com/_defcon_

                Updates can be made there through other paths when the usual sources are unavailable.



                • Re: PUBLIC-NOTICES: Forum Changes/Fixes. Any Questions?

                  An old event from a few Defcons past is returning under new management with some support from the old management.

                  DaKahuna and others will be running the Defcon 19 Wireless Village.

                  DaKahuna described it in this post and gave me a description for it using another source:

                  Originally posted by description
                  The Defcon 19 Wireless Village will be the place to be for wireless training and demonstrations Saturday, Friday and maybe Thursday (depending on scheduling.) Also, Friday & Saturday for two hours each day: Technician Class training session and other events the rest of each day. On Sunday: Ham Radio Exams.
                  I checked with Nikita and she said it has been given an okay to return.



                  • Re: PUBLIC-NOTICES: Forum Changes/Fixes. Any Questions?

                    Sorry for the unexpected down-time of about 15 minutes between 11:10 and 11:25pm.

                    I was making changes to the local host that were risky, and set up a background reboot in case I locked myself out. This way, if my change was successful, I could cancel the shutdown, and there would be no interruption. However, if I locked myself out, then the server would reboot on schedule and I could get back in and continue working.

                    I locked myself out, so services went down, and the machine auto-rebooted according to plan, even though there was no scheduled down-time. (First time this has happened since I have been working as an admin here -- usually I don't need the safety net of a reboot.)
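The safety net described above follows a general pattern: arm a delayed fallback before the risky change, and cancel it only once the change is confirmed to have worked. The following is a minimal sketch of the pattern, with a timer standing in for the scheduled reboot. The function name and callbacks are hypothetical, and a real server would use the operating system's own scheduled reboot rather than an in-process timer.

```python
import threading


def risky_change(apply_change, fallback, delay_seconds):
    """Arm `fallback`, run `apply_change`, and disarm only if the change succeeded.

    `apply_change` returns True when access still works afterwards. If it
    returns False (the "locked out" case), nothing cancels the timer and
    `fallback` runs once `delay_seconds` have passed.
    """
    timer = threading.Timer(delay_seconds, fallback)
    timer.daemon = True
    timer.start()  # the safety net is armed BEFORE anything risky happens

    succeeded = apply_change()
    if succeeded:
        timer.cancel()  # change worked: no reboot, no interruption

    # Wait for the timer thread to finish, either cancelled or fired.
    timer.join()
    return succeeded


if __name__ == "__main__":
    events = []
    risky_change(lambda: True, lambda: events.append("reboot"), 0.2)
    risky_change(lambda: False, lambda: events.append("reboot"), 0.2)
    print(events)  # only the failed change triggers the fallback
```

The ordering is the point of the design: because the fallback is scheduled first, it still fires when the person who scheduled it can no longer reach the machine to do anything.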

                    This is all work to bring pics back online as well as testforums and one more that Jeff will probably talk about at some time in the future.
                    Last edited by TheCotMan; March 14, 2011, 23:40.



                    • Re: PUBLIC-NOTICES: Forum Changes/Fixes. Any Questions?

                      Service has been restored to https://pics.defcon.org/ but expect unannounced maintenance cycles over the next 24 hours as I find and fix bugs or they are reported to me/us.

                      If you find issues with https://pics.defcon.org/ please let us know.

                      -Cot



                      • Re: PUBLIC-NOTICES: Forum Changes/Fixes. Any Questions?

                        I am scheduling downtime on pics, forums and testforums for DB maintenance.

                        This will probably happen at around noon Pacific time on Sunday and last at least 1 hour.

                        If all goes well, another round of maintenance for each service (one at a time) will follow.



                        • Re: PUBLIC-NOTICES: Forum Changes/Fixes. Any Questions?

                          Originally posted by TheCotMan
                          I am scheduling downtime on pics, forums and testforums for DB maintenance.

                          This will probably happen at around noon Pacific time on Sunday and last at least 1 hour.

                          If all goes well, another round of maintenance for each service (one at a time) will follow.
                          DB work is complete.

                          Testforum maintenance will begin at 2:00pm and continue for a while. Once testing is complete another short round of maintenance will be done on the forums this evening. An announcement on when the production forums (these forums) will be down will appear here (and on twitter) later today.



                          • Re: PUBLIC-NOTICES: Forum Changes/Fixes. Any Questions?

                            Originally posted by TheCotMan
                            DB work is complete.

                            Testforum maintenance will begin at 2:00pm and continue for a while. Once testing is complete another short round of maintenance will be done on the forums this evening. An announcement on when the production forums (these forums) will be down will appear here (and on twitter) later today.
                            I finished work on the testforum, and will be working on pics at 6:00pm Pacific time. After work on pics is complete, I will take down the forums, and perform the same maintenance on them as on pics and the test forums. (No. This is not the major upgrade of the forum software mentioned elsewhere.)



                            • Re: PUBLIC-NOTICES: Forum Changes/Fixes. Any Questions?

                              Downtime for this round of software work lasted only 5 minutes for each server (pics and forums).

                              Scheduled maintenance of production servers should now be complete unless problems are found or reported.



                              • Re: PUBLIC-NOTICES: Forum Changes/Fixes. Any Questions?

                                Originally posted by TheCotMan
                                (Email is still broken because of mail server configs handled by another admin.) To get this fixed will require someone else that has access to that device to allow for this activity.

                                If you need to reset your password or create a new account while email access from the forums is down, email defconforums@gmail.com from the email address associated with the account in question. (If they don't match, I won't change the account.) I will confirm that you actually sent the email request and the email is not forged, so expect a reply asking you to confirm you made a request by replying again, including a special, unique string I will email to you.
                                I have completed this dialog with a few people since announcing this. I will be manually composing email messages to other people that have requested forum accounts since this hardware failure and move to a new server. Those messages should go out later today.

                                Another admin that manages the mail server has not been available on-site to make the required changes to fix email relay through that server, and there is no ETA for this. I continue to address this through a manual process, which is slower.

                                Apologies for the delays caused by this.
