| Author |
Message |
Paul
CastleCops Founder
 Joined: Feb 22, 2002 Posts: 27351
|
Posted: Fri Feb 22, 2008 3:10 pm Post subject: Exec Sum 22 Feb 2008 |
|
|
Greetings folks, as of the first executive summary posting yesterday:
/t216007-Executive_summary_21_Feb_2008.html
I was able to pull a strange error message from the 3ware 9000 series controller card via tw_cli (a portion is displayed here):
tw_cli /c0 show diag
| Code: | Error, Unit 64: Logical unit not present
(EC:0x10a, SK=0x05, ASC=0x25, ASCQ=0x00, SEV=01, Type=0x70)
unit=64 |
Last night, working with Paul Vixie, I migrated MySQL to another server running FreeBSD 7 RC2 and MySQL 5.0.x. It's also running on a very cool filesystem that seems right now to be supporting fast access.
The outage last night was long. I started a thread here:
http://de.castlecops.com/forum/t2241-feb-2122-wwwcastlecopscom-outage.html
The third attempt to load the SQL dataset worked. I had to work out some bugs first:
http://wiki.freebsd.org/ZFSTuningGuide
So the server running www.castlecops.com seems to have issues with our controller card. And it appears we'll be getting it replaced by the original anonymous server sponsor. Thank you to anonymous.
In the meantime, mysqld will continue to live on this other system. The www.castlecops.com server is still having load issues, with load going as high as 25 and as low as 1. So we're not out of hot water yet. But we're running better and more consistently, even with all the IRT scripts on.
As to the strange diag error above, I found one good reference to it on the net, and both live.com and yahoo.com showed me the link, but not google.com:
http://www.dslreports.com/forum/r18385716-FreeBSD-Disk-read-corruption-issues-on-server
I'd like to also address one concern found here with respect to my desire in keeping the forums and the site running:
/p1059146-Optimizing_Performance.html#1059146
My family is growing, and I recently turned down a full time job with benefits. Why? Because the terms required CastleCops to close, or cease my involvement with this community.
I believe in CastleCops, and I know everyone here does too.
As for the term I searched on, it was:
| Code: | | 3ware "Logical unit not present" |
This is all I know right now.
Thank you to everyone for caring and helping.
History: Optimizing Performance original thread: /t215785-Optimizing_Performance.html (ten pages)
|
|
| Back to top |
|
 |
Paul
CastleCops Founder
 Joined: Feb 22, 2002 Posts: 27351
|
Posted: Fri Feb 22, 2008 3:37 pm Post subject: |
|
|
Addendum:
Please note, I've not been responding to lots of emails, PMs, phone calls, etc. due to the perf issues. I'll attempt to start getting back to all those items. _________________ Paul Laudanski - http://www.laudanski.com
http://www.linkedin.com/pub/1/49a/17b
|
|
|
 |
Trpm
Security Expert Premium Member
 Joined: Jan 16, 2004 Posts: 1663
|
|
|
 |
Bill_Bright
General
 Premium Member
 Joined: Jan 16, 2004 Posts: 9046 Location: Nebraska, USA
|
Posted: Fri Feb 22, 2008 4:54 pm Post subject: |
|
|
Thanks for the update Paul.
| Quote: | I'd like to also address one concern found here with respect to my desire in keeping the forums and the site running:
My family is growing, and I recently turned down a full time job with benefits. Why? Because the terms required CastleCops to close, or cease my involvement with this community. | I appreciate you sharing that. I suspect it was general knowledge to staff, but not to most regular members.
If the openness and candor from site admin over the last few days continues, and the performance of the site continues as it is today, the dedication of site admin to keep the forums up and running cannot come under question again.
Thanks, once again, for addressing these performance issues - obviously, with the site feeling alive again, performance issues minimized, and the free flow of information/lessons learned over just the last couple days, this is a Win/Win all around, IMO. _________________
Bill, AFE7Ret
Freedom is NOT Free!
|
|
|
 |
Paul
CastleCops Founder
 Joined: Feb 22, 2002 Posts: 27351
|
|
|
 |
Deacon10
1st Responder Premium Member
 Joined: Aug 27, 2007 Posts: 881 Location: Florida
|
Posted: Fri Feb 22, 2008 5:51 pm Post subject: |
|
|
Hello Paul, I in no way understand the problems with the board, but whatever you did last night sure made a heck of a difference. Thanks for your efforts, it is appreciated... _________________ Deacon10
"Hindsight explains the injury that foresight would have prevented"
|
|
|
 |
AlphaCentauri
SIRT Handler Premium Member
 Joined: Nov 20, 2003 Posts: 2895
|
Posted: Fri Feb 22, 2008 5:54 pm Post subject: |
|
|
| Quote: | | In total there are 1266 users online :: 50 Registered, 2 Hidden and 1216 Guests |
Looks like people couldn't wait for the site to be available again. (The breakdown is approx. 33% HijackThis forum, 6% forum index, and the rest pretty widely dispersed.)
|
|
|
 |
Bill_Bright
General
 Premium Member
 Joined: Jan 16, 2004 Posts: 9046 Location: Nebraska, USA
|
|
|
 |
Paul
CastleCops Founder
 Joined: Feb 22, 2002 Posts: 27351
|
|
|
 |
Mad_Dog
Private

 Joined: Mar 23, 2007 Posts: 44
|
Posted: Fri Feb 22, 2008 6:25 pm Post subject: |
|
|
The site is better. I am able to work the phish links faster.
It appears to me a logical unit of the RAID failed or is off-line. Probably LUN 64. The error code (EC) looks like something thrown from the SCSI bus.
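For anyone who wants to check that reading, here is a rough Python sketch that decodes the fields from the diag line above. It assumes (not confirmed against 3ware documentation) that SK, ASC and ASCQ in the tw_cli output are the standard SCSI sense key, additional sense code and qualifier. The lookup tables are only a small subset of the standard lists, and the function name decode_sense is just something made up for illustration.

```python
import re

# Subset of the standard SCSI sense keys (not a complete list).
SENSE_KEYS = {
    0x00: "NO SENSE",
    0x01: "RECOVERED ERROR",
    0x02: "NOT READY",
    0x03: "MEDIUM ERROR",
    0x04: "HARDWARE ERROR",
    0x05: "ILLEGAL REQUEST",
    0x06: "UNIT ATTENTION",
    0x0B: "ABORTED COMMAND",
}

# Subset of the standard (ASC, ASCQ) pairs (not a complete list).
ASC_ASCQ = {
    (0x04, 0x00): "LOGICAL UNIT NOT READY, CAUSE NOT REPORTABLE",
    (0x20, 0x00): "INVALID COMMAND OPERATION CODE",
    (0x24, 0x00): "INVALID FIELD IN CDB",
    (0x25, 0x00): "LOGICAL UNIT NOT SUPPORTED",
}

# Matches NAME:VALUE or NAME=VALUE, where VALUE is hex (0x..) or decimal.
FIELD = re.compile(r"(\w+)[:=](0x[0-9a-fA-F]+|\d+)")


def decode_sense(line):
    """Hypothetical helper: parse one tw_cli diag line into decoded fields."""
    fields = {}
    for name, value in FIELD.findall(line):
        base = 16 if value.lower().startswith("0x") else 10
        fields[name] = int(value, base)
    sk = fields.get("SK")
    pair = (fields.get("ASC"), fields.get("ASCQ"))
    return {
        "raw": fields,
        "sense_key": SENSE_KEYS.get(sk, "unknown"),
        "additional": ASC_ASCQ.get(pair, "unknown"),
    }


if __name__ == "__main__":
    diag = "(EC:0x10a, SK=0x05, ASC=0x25, ASCQ=0x00, SEV=01, Type=0x70)"
    result = decode_sense(diag)
    print(result["sense_key"], "/", result["additional"])
```

If that assumption holds, SK=0x05 with ASC=0x25/ASCQ=0x00 is an ILLEGAL REQUEST for a logical unit the target does not support, which fits a request sent to a unit (64 here) that the controller does not have. By itself that would not prove a drive failed.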
It has been a while since I rebuilt a failed drive on a RAID at that level.
Mad_Dog
|
|
|
 |
Bill_Bright
General
 Premium Member
 Joined: Jan 16, 2004 Posts: 9046 Location: Nebraska, USA
|
|
|
 |
Coldmoon
Returnil Premium Member
 Joined: Sep 30, 2006 Posts: 198 Location: USA
|
Posted: Fri Feb 22, 2008 6:35 pm Post subject: |
|
|
Hi Paul,
You can ignore my PM, and kudos on the fine work. The site is loading nicely now, with hardly any delay to speak of, so your efforts have had some very visible success.
Mike 
|
|
|
 |
Blair
Microsoft MVP
Joined: Feb 17, 2004 Posts: 26 Location: USA
|
Posted: Fri Feb 22, 2008 9:10 pm Post subject: |
|
|
| Paul wrote: | That was the first mention of it.
Also, we learned yesterday that the main box www.castlecops.com is on is only a single proc, single core. We had believed it was a dual proc, single core forever, but that is incorrect. |
Oh wow! Well that certainly explains a lot. There's no way a single core CPU can keep up with the demands of this site. Good luck with getting some new hardware. Let me know if I can help.
|
|
|
 |
Mad_Dog
Private

 Joined: Mar 23, 2007 Posts: 44
|
Posted: Fri Feb 22, 2008 10:11 pm Post subject: |
|
|
| Bill_Bright wrote: | | Quote: | | It has been a while since I rebuilt a failed drive on a RAID at that level. | I try to suppress memories of traumatic events. |
MD
|
|
|
 |
Paul
CastleCops Founder
 Joined: Feb 22, 2002 Posts: 27351
|
Posted: Fri Feb 22, 2008 10:18 pm Post subject: |
|
|
Thanks Blair and all. Lots of great help in this community. The next step is to get the controller card replaced, hopefully next week, and then we'll see what comes after that. At least we've returned to some kind of normalcy. And hopefully all the troubleshooting that took place publicly will help others. _________________ Paul Laudanski - http://www.laudanski.com
http://www.linkedin.com/pub/1/49a/17b
|
|
|
 |
|
|