Thursday, December 25, 2008
Wednesday, December 24, 2008
Saturday, October 18, 2008
Wednesday, October 15, 2008
Thursday, October 9, 2008
FireWall
With so many revelations, it is hard not to take more interest in the financial world. The following is a collection of audio and video for Main Street from public radio.
Thursday, September 4, 2008
Wednesday, September 3, 2008
Wednesday, August 20, 2008
Wednesday, July 23, 2008
Tuesday, July 15, 2008
Tuesday, July 8, 2008
Monday, July 7, 2008
Saturday, July 5, 2008
Friday, July 4, 2008
Thursday, July 3, 2008
Wednesday, July 2, 2008
Sunday, June 29, 2008
Saturday, June 28, 2008
Automate
Installation, configuration, deployment, and management of many machines using the following:
SystemImager
CVSup
Subcon
Friday, June 27, 2008
Thursday, June 26, 2008
Real Time System Tracing
"premature optimization is the root of all evil (or at least most of it) in programming."
Monitoring Comparative
Cacti:
- Graphing Monitoring Data
- Data Store in RRDtool
- No reporting
- Plugin support for extension
Ganglia:
- Highly scalable monitoring system
- Designed for grid and cluster computing
- XML for data representation
- XDR for compact portable data transport
- RRDtool for data storage and visualization
- Ideal for large monitoring environments
- No support for events or notifications
- Does not support thresholds
- Extension by pluggable modules, similar to Apache
- Gmond – Metric gathering agent installed on individual servers
- Gmetad – Metric aggregation agent installed on one or more specific task oriented servers
- Ported to various different platforms (Linux, FreeBSD, Solaris, others)
- Apache Web Frontend – Metric presentation and analysis server
- Attributes:
- Multicast – All gmond nodes are capable of listening to and reporting on the status of the entire cluster
- Failover – Gmetad has the ability to switch which cluster node it polls for metric data
- Lightweight and low overhead metric gathering and transport
Nagios:
- Monitors hosts, services, and networks
- Plugin extension
- Automatic log file rotation
- Redundant monitoring hosts
- Web interface for viewing current network status, notification and problem history, log files, etc.
- Supports thresholds
- Generates events or notifications against thresholds
- Supports "active checks" and "passive checks"
RRDtool:
- Data logging and graphing system for time series data
- Constant data storage size
- Data storage is created upfront
Munin:
- Written in Perl
- Uses RRDtool
- Plugins can be written in any language
- Master/Node Architecture
- Default plugins like load average, memory usage, CPU usage and network traffic
Wednesday, June 25, 2008
Tuesday, June 24, 2008
SELinux
"God is a challenge because there is no proof of his existence and therefore the search must continue."
Graphing
"People think that computer science is the art of geniuses but the actual reality is the opposite, just many people doing things that build on each other, like a wall of mini stones."
Monday, June 23, 2008
Saturday, June 21, 2008
Friday, June 20, 2008
Thursday, June 19, 2008
Performance
"The hardest thing is to go to sleep at night, when there are so many urgent things needing to be done. A huge gap exists between what we know is possible with today's machines and what we have so far been able to finish."
Logging Services
"The most important thing in the kitchen is the waste paper basket and it needs to be centrally located."
Sunday, June 15, 2008
Unix Tools
"I define UNIX as 30 definitions of regular expressions living under one roof."
For Linux:
From GOOG:
Tuesday, June 10, 2008
Guru of the Week (gotw)
Guru of the Week is a regular series of C++ programming problems created and written by Herb Sutter. Since 1997, it has been a regular feature of the Internet newsgroup comp.lang.c++.moderated, where you can find each issue's questions and answers (and a lot of interesting discussion).
Wednesday, June 4, 2008
Tuesday, June 3, 2008
Wednesday, May 28, 2008
Scalable Nonblocking Data Structures
InfoQ has an interesting writeup of Dr. Cliff Click's work on developing highly concurrent data structures for use on the Azul hardware (which is in production with 768 cores), supporting 700+ hardware threads in Java.
Thursday, May 22, 2008
All Things Distributed
Wednesday, May 21, 2008
Sunday, February 3, 2008
Event Handling Framework
libevent provides a simple, portable framework for receiving events that uses the most efficient system calls available on your system.
Saturday, February 2, 2008
Unix Memory Model
There are some basic regions ("segments") provided by all Unix variants:
- Stack: (Variable size) This is where information about function call sequences is stored. There is microprocessor support for the stack.
- Code: (Fixed size) The area of memory containing machine code instructions. Typically r+x permissions. Aka the text segment.
- Data: (Fixed size) The area of memory containing initialized data. This includes static variables, string constants, etc.
- BSS: (Variable size) The area of memory containing uninitialized data. Strictly, the BSS itself holds zero-filled static variables and is fixed at load time; the variable part is the heap, which traditionally grows from the end of the BSS via brk () and is where "heap allocated" objects live.
System V shared memory
System V provides an alternative mechanism for setting up shared memory, via the shmctl (), shmget (), shmat (), and shmdt () set of calls. These are not suggested for use, because:
- System V entities live in a separate namespace with separate access permissions and administrative tools (e.g., ipcs).
- System V entities are not automatically cleaned up if all programs using them exit, and can be a resource management nightmare.
Shared memory
Physical memory can be shared between two processes merely by manipulating their page tables. This happens automatically in modern Unixes in various circumstances, e.g.:
- When implementing shared libraries, the dynamic loader will mmap the library, and the kernel will share the maps amongst processes.
- When forking, the child gets a copy of the parent's page table, i.e., their pages originally all coexist in physical memory; but the first time a page is written (by either), the kernel traps the write and makes a copy. Thus a child can share a large read-only data structure constructed by the parent prior to forking; although as a caveat, in some languages, a read-only data structure is still written to by the runtime (e.g., garbage collection metadata).
- Threads share their entire page map. The OS will simply reset the stack pointer when switching contexts, as opposed to flushing the TLB.
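The fork behaviour described above can be observed from user space. This is a minimal sketch in Python, assuming a POSIX system where os.fork is available; the function and variable names are made up for illustration. The child writes to its view of a structure built by the parent, and the parent's view is unaffected because the write lands on the child's private copy of the page. CPython is itself an instance of the caveat above: its reference counts live inside the objects, so even "read-only" access from the child dirties pages.

```python
import os


def child_view_after_fork(table):
    """Fork, let the child overwrite an entry, and return what the child saw."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: starts out sharing the parent's physical pages.
        os.close(read_fd)
        table["answer"] = 0  # the kernel copies the touched page for the child
        os.write(write_fd, str(table["answer"]).encode())
        os._exit(0)  # skip interpreter cleanup in the child
    # Parent: collect the child's report and reap it.
    os.close(write_fd)
    seen_by_child = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return seen_by_child


big_table = {"answer": 42}  # built by the parent before forking
child_value = child_view_after_fork(big_table)
```

After the call, child_value holds the child's overwritten value while big_table in the parent still holds the original one.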
mmap
The mmap () system call allows the programmer to associate a region of the process virtual address space with a file. It is an extremely general purpose utility:
- It allows the mapped memory to have protection attributes (readable, writable, executable).
- It allows the process to have a private (copy-on-write) version of the file; changes are private to the process and disappear when the process exits. Alternatively, it allows the process to share the mapping with other processes; writing the memory area is equivalent to writing the file.
- The memory mapped region need not correspond to an actual file (i.e. anonymous); by creating an anonymous mmap in a parent and forking, the children can share memory.
- The namespace for mmap corresponds to the filesystem, adhering to the "everything is a file" Unix ideal.
- Access permissions correspond to file permissions.
- The actual relationship between the virtual address space and physical memory consumed by the mmap is controlled by the OS; in particular, memory resources are automatically freed when all processes using an mmap either munmap or exit.
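The anonymous-mapping case can be sketched with Python's mmap module, again assuming a POSIX system with os.fork. Passing -1 as the file descriptor requests an anonymous region, and on Unix the module's default flag is MAP_SHARED, so a write made by the child is visible to the parent. The function name is hypothetical.

```python
import mmap
import os
import struct


def shared_counter_demo():
    """Share one integer between parent and child through an anonymous mmap."""
    # fileno -1 means no backing file; the default Unix flags are MAP_SHARED.
    region = mmap.mmap(-1, mmap.PAGESIZE)
    struct.pack_into("i", region, 0, 1)  # parent writes 1
    pid = os.fork()
    if pid == 0:
        struct.pack_into("i", region, 0, 2)  # child overwrites it with 2
        os._exit(0)
    os.waitpid(pid, 0)
    (value,) = struct.unpack_from("i", region, 0)
    region.close()  # unmaps the region in this process
    return value


value_after_child = shared_counter_demo()
```

Contrast this with an ordinary Python object, which the child would only modify in its own copy-on-write pages.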
Associated with mmap are the system calls msync () and madvise (). msync instructs the OS to write all modified pages to disk, either synchronously (don't return until the write is complete) or asynchronously (return after the sync has been scheduled); the OS will also sync dirty pages to disk asynchronously on its own if the proper flag is passed to mmap. madvise provides hints to the kernel as to how the program will access the mmap, so that the kernel can optimize accordingly.
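A file-backed shared mapping with an explicit sync might look like the sketch below. It uses Python's mmap module, whose flush () method is the wrapper around msync on Unix; the madvise call is guarded because it only exists on newer Python versions and some platforms. The scratch file and function name are assumptions for the example.

```python
import mmap
import os
import tempfile


def write_through_mapping(path, payload):
    """Write payload into a file via a shared mapping, then read the file back."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; MAP_SHARED makes writes reach the file.
        with mmap.mmap(f.fileno(), 0, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as m:
            if hasattr(m, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                m.madvise(mmap.MADV_SEQUENTIAL)  # hint: sequential access
            m[:len(payload)] = payload
            m.flush()  # push the dirty pages to the file before returning
    with open(path, "rb") as f:
        return f.read()


# A 16-byte scratch file stands in for real data (an empty file cannot be mapped).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 16)
    scratch_path = tmp.name
contents = write_through_mapping(scratch_path, b"hello")
os.unlink(scratch_path)
```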
Unix Signals
Signals are an asynchronous notification mechanism. Signals are covered by a POSIX standard. Under Linux, the signal (7) man page contains the list of signals supported.
POSIX.1 signals
Event Server
Servers typically handle three types of events: file descriptor, signal, and timeout.
1. File Descriptor Events
There are several system calls available to receive file descriptor events.
- select (2) is the most portable and least efficient.
- poll (2) is nearly as portable, less inefficient, and has a very intelligible interface.
- Linux has epoll (4) which is a vastly more efficient variant of poll.
- FreeBSD has kqueue, which is a single extensible kernel interface for all event handling.
- Finally, POSIX.4 defines asynchronous I/O (AIO).
- select: Portability: maximum, Efficiency: worst, Notification Type: readiness, level triggered
- poll: Portability: maximum, Efficiency: poor, Notification Type: readiness, level triggered
- /dev/poll: Portability: Solaris, Efficiency: acceptable, Notification Type: readiness, level triggered
- epoll: Portability: Linux 2.4+, Efficiency: good, Notification Type: readiness, level or edge triggered
- AIO: Portability: Linux 2.6, FreeBSD, Efficiency: variable, Notification Type: completion
- kqueue: Portability: BSD, OS X, Efficiency: good, Notification Type: completion and readiness, level or edge triggered
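To make "readiness, level triggered" concrete, here is a small sketch using Python's select.poll wrapper around poll (2), assuming a platform where it is available (Linux and most Unixes). The function name is made up. A pipe with one unread byte is reported as readable on every poll call until the byte is consumed; an edge-triggered interface would report it only once.

```python
import os
import select


def poll_until_drained():
    """Return how many events each of three polls reports for one pipe."""
    read_fd, write_fd = os.pipe()
    poller = select.poll()
    poller.register(read_fd, select.POLLIN)

    os.write(write_fd, b"x")
    first = poller.poll(0)   # ready: one byte is waiting
    second = poller.poll(0)  # still ready: nothing has consumed the byte
    os.read(read_fd, 1)      # drain the pipe
    third = poller.poll(0)   # no longer ready

    os.close(read_fd)
    os.close(write_fd)
    return [len(first), len(second), len(third)]


event_counts = poll_until_drained()
```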
2. Signal Events
Using kqueue makes it easy to mix signal event and file descriptor event notification. There is an event filter for signals, interest is signaled similarly to file descriptors, and the events are delivered in the same way as file descriptor events.
Besides kqueue, every other way is ugly.
If the signal is delivered during the poll system call, poll will be interrupted and fail with errno set to EINTR, even if the timeout is -1. Thus, you will handle the signal event with low latency. The bad news is, if a signal is delivered between the first sigprocmask call and the poll call, and the poll timeout is -1, poll will not be interrupted and the signal will not be handled until (if) a file descriptor event occurs. One way to guard against this is to have a maximum poll timeout, e.g., of 100ms, which means in exchange for the (slight) extra overhead of 10 system calls a second when idle, you will have a maximum signal latency of circa 100ms.
POSIX provides the pselect (2) system call, which is like the sequence
sigprocmask (SIG_SETMASK, &mask, &oldmask);
select (...);
sigprocmask (SIG_SETMASK, &oldmask, NULL);
except that the system call eliminates the possibility of a signal being delivered between when the sigprocmask call returns and the select call begins. It was designed for the usage outlined above with poll, and therefore sounds ideal. Unfortunately, pselect has historically been broken under Linux: before kernel 2.6.16, glibc emulated it in user space with sigprocmask and select, which leaves the very race it is meant to close. Also, it uses select, and we prefer poll.
Another alternative is to have your signal handler write to a file descriptor that is included in your poll set:
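A minimal sketch of this self-pipe arrangement in Python, assuming a POSIX system and that the code runs in the main thread (Python only allows signal handlers to be installed there). The function name, the choice of SIGUSR1, and the use of os.kill to simulate an external signal are all assumptions for the example.

```python
import os
import select
import signal
import struct


def wait_for_signal_via_pipe(signo=signal.SIGUSR1, timeout_ms=1000):
    """Deliver signo to ourselves and pick it up through a pipe in the poll set."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)   # non-blocking on both ends avoids deadlock
    os.set_blocking(write_fd, False)

    def handler(received_signo, frame):
        try:
            # A 4-byte write to a pipe is far below PIPE_BUF, hence atomic.
            os.write(write_fd, struct.pack("i", received_signo))
        except BlockingIOError:
            pass  # pipe is full, so a wakeup is already pending

    previous = signal.signal(signo, handler)
    try:
        poller = select.poll()
        poller.register(read_fd, select.POLLIN)
        os.kill(os.getpid(), signo)     # stand-in for an externally sent signal
        ready = poller.poll(timeout_ms)
        if not ready:
            return None
        (delivered,) = struct.unpack("i", os.read(read_fd, 4))
        return delivered
    finally:
        if previous is not None:
            signal.signal(signo, previous)  # restore the earlier disposition
        os.close(read_fd)
        os.close(write_fd)


delivered_signo = wait_for_signal_via_pipe()
```

Python also ships signal.set_wakeup_fd, which has the interpreter's low-level handler write the signal number to a descriptor directly; the explicit handler is used here only to mirror the description.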
With this setup, you can have an arbitrary poll timeout and maintain low latency. However, it's important to use a pipe, so that the (typically 4 byte) write and read of the signo is atomic; under POSIX, only pipes guarantee a minimum atomic read/write size larger than 4 bytes. It's also important that the pipe be set to non-blocking (see below), to avoid deadlock.
Both techniques also apply to epoll.
3. Timeout Events
The next event type of interest is the timer, e.g., you want a 50ms timeout on a sub request. In general, you may have many more simultaneous timers pending in a complicated server than you have file descriptors, since there are generally multiple timeouts per request.
Once again, kqueue makes it easy. There is a timer filter type which is treated similarly to the file descriptor filters.
Also once again, every other way makes it ugly.
poll, epoll, etc. all provide a single timeout argument to the system call. The problem, then, is to consider the entire set of timers, determine the delta-t until the next timer goes off, and use that delta-t as the timeout argument to the system call. By storing the timers in a balanced binary tree sorted by (absolute, not relative) expiration time, the next expiring timer can be found in O (log (N)) time, and creating and removing timers can also be done in O (log (N)) time; the latter is especially important, since most timers are cancelled before expiring (since they are most often used to time out subrequests, and most of the time, your subservers are within SLA). This technique utilizes the gettimeofday (2) system call.
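A sketch of the timer bookkeeping, with two substitutions worth naming: it uses a binary heap (Python's heapq) rather than a sorted binary tree, with cancellation done lazily by marking the entry, and it reads a monotonic millisecond clock rather than gettimeofday so that wall-clock adjustments cannot move deadlines. The class and method names are hypothetical.

```python
import heapq
import itertools
import time


def monotonic_ms():
    return time.monotonic_ns() // 1_000_000


class TimerQueue:
    """Timers keyed by absolute deadline in milliseconds."""

    def __init__(self, clock=monotonic_ms):
        self._clock = clock
        self._heap = []                # entries: [deadline, seq, callback]
        self._seq = itertools.count()  # tie-breaker so callbacks never compare

    def add(self, delay_ms, callback):
        entry = [self._clock() + delay_ms, next(self._seq), callback]
        heapq.heappush(self._heap, entry)  # O(log N)
        return entry

    def cancel(self, entry):
        entry[2] = None  # O(1) mark; the entry is dropped when it surfaces

    def poll_timeout_ms(self):
        """Timeout to hand to poll: delta-t to the next live timer, or -1."""
        while self._heap and self._heap[0][2] is None:
            heapq.heappop(self._heap)  # discard cancelled timers
        if not self._heap:
            return -1                  # nothing pending: block indefinitely
        return max(0, self._heap[0][0] - self._clock())

    def run_expired(self):
        """Fire every timer whose deadline has passed; return how many fired."""
        now = self._clock()
        fired = 0
        while self._heap and self._heap[0][0] <= now:
            _, _, callback = heapq.heappop(self._heap)
            if callback is not None:
                callback()
                fired += 1
        return fired
```

An event loop would call poll with poll_timeout_ms () as the timeout, handle any file descriptor events, and then call run_expired ().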
Thursday, January 31, 2008
Wednesday, January 30, 2008
Tuesday, January 29, 2008
Monday, January 28, 2008
Sunday, January 27, 2008
Message Broker
Saturday, January 26, 2008
Virtualization is being applied in all aspects of software development; this DDJ article recommends it for build environments.
Googling "father of C++" returns Bjarne Stroustrup's homepage. The homepage has useful resources, especially the technical FAQ.
Friday, January 25, 2008
Memory Leak
In Memory Cache
Commercial:
Open Source:
- Client libraries are available in C, Java, PHP, etc.
- Clients partition data across memcached servers using a modular hash algorithm: 'key % # partitions'.
- If an item isn't found in the cache, it's looked up in the source of truth and added to the cache. The extra cost of hitting the source of truth is amortized across all the accesses.
- Distributed Caching with Memcached
- Consistent Hashing
- FAQ: http://www.socialtext.net/memcached/index.cgi?faq
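The partitioning and miss-handling behaviour described above can be sketched without a real memcached client. In this Python sketch, plain in-process dicts stand in for the memcached servers, zlib.crc32 is an assumed choice of hash, and the class, function, and server names are hypothetical; it illustrates the pattern rather than any client library's API.

```python
import zlib


def pick_server(key, servers):
    """Modular hashing: hash the key, then take it modulo the server count."""
    return servers[zlib.crc32(key.encode()) % len(servers)]


class CacheAside:
    """Look in the cache first; on a miss, load from the source of truth."""

    def __init__(self, servers, load_from_source):
        self._servers = servers
        # One dict per server name, standing in for remote memcached instances.
        self._stores = {name: {} for name in servers}
        self._load = load_from_source
        self.misses = 0

    def get(self, key):
        store = self._stores[pick_server(key, self._servers)]
        if key not in store:
            self.misses += 1
            store[key] = self._load(key)  # pay the slow lookup once
        return store[key]


cache = CacheAside(["cache-a", "cache-b", "cache-c"], lambda key: key.upper())
first_value = cache.get("user:1")
second_value = cache.get("user:1")  # served from the cache this time
```

Because the server index depends on the number of servers, adding or removing one remaps most keys, which is the problem consistent hashing addresses.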
Testing Tools
Here are a few I bumped into today for Traffic generation:
SWIG
Check out more @ SWIG