One of the things I've been thinking about writing about is interesting and common math problems. I'll start with some easy ones that I've had to do at work (no trade secrets, just common math in the trade).

Notes:
- I'll use the prefix "0x" to denote that a number is written in hex notation, e.g. 0x10 is 16 and 0x1F is 31.
- There are 8 bits in a byte (I may not get to that until later problems though).

Largest packet aligned buffer: MPEG-2 transport streams have both 188 and 204 byte packets. In order to transfer a high speed transport stream, large buffers are needed. Large buffers reduce the number of interrupts per second and make for effective DMA transfers. To process an MPEG-2 transport stream efficiently, buffers should be a multiple of the packet size. Splitting and joining buffers is not only "a pain" to program, but also requires otherwise unnecessary processing power.

With a maximum buffer size of 0x20000 (131072) bytes and a chosen packet size of 188, what is the largest packet aligned buffer size allowed? The answer is simply 131072/188*188 when the division is integer division (i.e. no decimal places). To do this with a calculator that only does regular division, one simply needs to remember the integer part of the division. In this case 131072/188 = 697.19..., so I clear the calculator, type 697*188 and get 131036 (0x1FFDC).

With a maximum buffer size of 0x20000 (131072) bytes and a chosen packet size of 204, what is the largest packet aligned buffer size allowed? The answer is 131072/204*204 where the division is integer division. The final number is 130968 (0x1FF98).

What is the minimum buffer size that is divisible by both 188 and 204? To solve this, take the prime factors of each number and multiply together the union of them, counting shared factors only once. The factors of 188 are 2, 2, 47. The factors of 204 are 2, 2, 3, 17. So the answer is 2*2*47*3*17 = 9588 (0x2574), or equivalently 188*204/2/2 = 9588 (0x2574).
With a maximum buffer size of 0x20000 (131072) bytes and a packet size that can only be 188 or 204, what is the largest packet aligned buffer allowed? Using the result from above that 9588 bytes is the smallest multiple of both 188 and 204, it's simply 131072/9588*9588 (again where the division is integer division). So the answer is 124644 (0x1E6E4).

Other numbers of interest:
- 196 = 188 + 8 (64 bits is 8 bytes) for a 188 byte packet with a 64 bit timestamp.
- 212 = 204 + 8 for a 204 byte packet with a 64 bit timestamp.
- 512 bytes is a common write size divisor for hard drives.
- 192 = 188 + 4 (32 bits is 4 bytes) for the MPEG stride (MPEG-2 transport stride?) format (HDV).
- 4 bytes (32 bits) seems to be a preferred DMA number (the PCI bus is always at least 32 bits wide).
- 8 bytes (64 bits) might be preferred for certain DMA.

I'll probably write myself up a quick reference. When programming, you can have the program calculate the right numbers given any buffer size, packet size or other factor.

Originally from: http://www.boxheap.net/ddaniels/notes/20050820.txt
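The arithmetic above is easy to sanity check in a few lines of Python (the helper name is my own, not from any real codebase):

```python
from math import gcd

def largest_aligned(max_size, packet):
    """Largest buffer <= max_size that holds a whole number of packets."""
    return max_size // packet * packet  # // is integer division

MAX = 0x20000  # 131072 bytes

print(largest_aligned(MAX, 188))   # 131036 (0x1FFDC)
print(largest_aligned(MAX, 204))   # 130968 (0x1FF98)

# The minimum size divisible by both packet sizes is the least common
# multiple: 188 = 2*2*47 and 204 = 2*2*3*17 share the factor 2*2 = 4.
both = 188 * 204 // gcd(188, 204)
print(both)                        # 9588 (0x2574)
print(largest_aligned(MAX, both))  # 124644 (0x1E6E4)
```

Computing the LCM via gcd avoids having to factor by hand, which matters once the packet sizes aren't small.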
August 20, 2005
20050820
August 17, 2005
20050817
So today I got a new USB key, a Kingston DataTraveler. I decided that I should set it up so that I can use it with the laptop I use on a regular basis (a very old Toshiba Satellite 310CDS that's rattling). My first step was to consider the file system format. I was surprised to find that I couldn't format the device as NTFS. I left it formatted at the default FAT32 and decided to get on to other things.

At home tonight I spent some time downloading the drivers for the device and attempted to install them. The driver installer is an InstallShield-created one that's been WinZipped into a self-extracting file (sometimes called an SFX). Three layers of compression and installer junk managed to make the under-49 KiB of driver files take over 1 MB. The second frustrating thing I ran into is that the installer is designed to detect the operating system, and refuses to install if it doesn't think it'll work. Well, I guess before that I had read a FAQ from Kingston saying that Windows 95 doesn't support USB drives at all.

Before I get ahead of myself again, I'll go back and say that I tried to do some research on what Windows 95 supports in the way of USB drives. I didn't manage to find much, but I did find the usual indications that earlier versions of Win95 didn't have any USB support, or that it wasn't working. I already knew that USB support worked on my computer.

I guess I should have seen things coming ahead of time. In December I had looked at trying to get pictures off a Fujifilm FinePix digital camera. It too had an annoying installer, and claimed not to work with Win95. It further had a bunch of software bundled with its installation that I still haven't bothered to figure out. Luckily, the installer for the driver itself wasn't hard to find, and I managed to get the device driver and some of the software installed.
Despite getting things installed for the FinePix camera, the software complained about a missing DLL function, and the driver didn't seem to be working. I decided that with my many licences of Windows, I should try to upgrade certain DLLs with versions from newer versions of Microsoft Windows. My results, of course, were that some of the important DLLs could not be replaced.

That got me thinking again about getting open source replacements for certain components. I looked for a while, and decided that without a better understanding, I might end up accidentally installing a DLL that needs a Linux shared library (.so) or something. My following of the Wine Weekly News (WWN) on http://www.winehq.com and reading of the ReactOS developers/kernel mailing list indicated that some DLLs from these projects were definitely dependent on components that I'm not ready to replace.

So more recently (getting back to the USB key), I did another search on the subject of replacing Microsoft Windows 95 DLLs with open source compatible versions. I'm also now considering replacing the kernel and other core files. I did find that WWN shows they've been building PE versions of their DLLs for Win32, but it's not clear which can replace the DLLs in Windows 95. I get the impression that files from ReactOS might be a better replacement than Wine's, as they'll have less Linux, BSD and Solaris related stuff in them and are created with binary compatibility in mind for even more core pieces (e.g. no required wineserver).

To date I've had no luck with either the USB key or the digital camera under Windows 95. I've decided that in order to start replacing Win95 on this notebook, I'd better get a better understanding of the dependencies and compatibilities of the different components. To do this I'd like to get or create a list of files, a graph (tree?) of the dependencies between files, and a fresh compatibility status of the files from whatever source I choose.
Unfortunately ReactOS's compatibility page doesn't jump out at me in searches (I remember seeing it once or twice). I also believe both ReactOS and Wine don't list their compatibility in relation to Windows 95, but to whatever the latest version of the component is. So the process I'll probably want to take will start with listing the operating system files on the computer I'm targeting. Then I'll probably use something like Dependency Walker (depends.exe, from Sysinternals?) to figure out the dependencies of each file (as best I can). Then I'll look at the compatibility status on the web. Lastly, I may have to look at the exports from both files. Since no one else seems to have published this information, I'll probably write up my findings as I go. I might even make it easier to install open source replacement components for other versions of Windows by performing the same process on fresh installs of other versions.

It's getting late now and I'm getting tired. I was planning to also write about how to use unshield and WinZip to extract files from annoying installers. I also felt the need several times to explain why I want open source replacement files and didn't upgrade Windows (remember, I do have licences for newer versions). I guess I can quickly say that I like having free access to the source of what I'm using so that I, or just about any other programmer, can enhance or fix it. I also don't want to install Windows 98 or later on this laptop because it may take more system resources or not run at all, and, well, I'd rather maximize the use of my Windows 95 licences before using other ones. I've tried ReactOS and Wine, and I know they're still not 100% replacements for Windows (although extremely close nowadays). I also believe that other people share my viewpoints and/or situations.
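The survey process above (list the files, then walk their imports) is simple enough to sketch. Everything here is hypothetical: the import table is hard-coded stand-in data, where the real version would be fed from Dependency Walker output or a PE parser.

```python
# Hard-coded stand-in for real import data (e.g. parsed Dependency
# Walker output); these entries are illustrative, not measured.
SAMPLE_IMPORTS = {
    "shell32.dll":  ["kernel32.dll", "user32.dll"],
    "user32.dll":   ["kernel32.dll", "gdi32.dll"],
    "gdi32.dll":    ["kernel32.dll"],
    "kernel32.dll": [],
}

def get_imports(dll):
    return SAMPLE_IMPORTS.get(dll, [])

def dependency_closure(root):
    """Every DLL reachable from root: all the files a replacement
    for root would have to stay compatible with."""
    seen = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(get_imports(name))
    return seen

print(sorted(dependency_closure("shell32.dll")))
# ['gdi32.dll', 'kernel32.dll', 'shell32.dll', 'user32.dll']
```

With the closure in hand, checking a replacement candidate becomes a per-file question instead of a whole-system gamble.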
Maybe later this week I'll write more on the topic of replacing Windows components or Windows device drivers (WDM, NDIS, INF, the wonderful dpinst.exe and more), but for now it's time for me to get some sleep...

Originally from: http://www.boxheap.net/ddaniels/notes/20050817.txt
August 11, 2005
20050811
I was planning on writing about the problems I faced at work looking up open source software for SMPTE 125M conversion. I kept finding SMPTE timecode stuff (for MIDI) and other usages of the acronym SMPTE without reference to which standard was being used. The ones related to SMPTE 125M are SMPTE 292M (HD-SDI), SMPTE 259M (transport of SDI and SDTI), SMPTE 305M (sometimes called SMPTE 305.2M, which is SDTI), and the document on ancillary data. Actually, SDTI really is quite different from SDI except that it goes over 259M.

Anyway, tonight I think I'll write a bit about linking and Google. Yes, part of the reason that I'm writing these notes is to increase the ranking that I'll get for topics that I'd like employers to see. The bigger way that I plan to get a good ranking is something I've accidentally found before. I've put a one line signature in my e-mails to mailing lists with my resume's URL. I was hoping I could find someone on the mailing list that might be interested, or might refer me to someone, but instead I found that the HTML mailing list archives looked to be increasing the rank of my resume. I guess this is a neat trick that can work on Google, and maybe on other search engines that look at what's linking to a page to give it a score.

When I'm finally happy with the testing scripts that I'm working on for my tarball enhancements, I'll post the results to various mailing lists that are development forums for projects with large tarballs (e.g. the LKML, some kind of GIMP mailing list, maybe some OpenOffice.org AKA OOo mailing lists...). I've got my resume's URL in the scripts themselves, but I also plan to put my resume URL tagline in my messages. One of my problems with my tarball enhancement postings is that I'll want a permanent place with my domain name where I can host the scripts, but I'm getting free hosting from a friend (thanks Dean).
I don't want to generate a lot of hits on my friend's server, due to the fact that he likely has better uses for his bandwidth, and his ISP may not appreciate it. To prevent such a load on the link to his server (and on his server), I plan to keep the scripts only on the mailing lists (archived in their archives) until interest drops down a bit. I figure a few weeks would do, but I'll probably wait a few months. I'm really quite keen to get my scripts out the door, but I feel they're not yet ready to stand up to the kind of criticism that one gets on the Linux Kernel Mailing List (LKML).

I've got a script to do the actual tarball creation, and one to show the difference between a normally generated one and the one my script makes, but I don't have something showing the amount of time that it takes. Measuring the sorting isn't easy, as it's a series of piped commands. My shell scripting really isn't put to enough use for me to be able to quickly work around such a problem. I've checked a few HOWTOs like the Bash one, and I've asked in the Bash scripting IRC channel, but I couldn't find an answer. I decided to put the commands into a separate script and time that whole script.

The other problem I've run into is testing. My home computer was taking a beating compressing and untarring etc. I decided to use my SourceForge compile farm shell to do the testing, but it's a pain to put files onto those machines. It took me a while before I figured out I had to download the files to my computer, and then upload them to the compile farm's central server via sftp or scp. That's something I can do, but it really compounds another problem I'm having. It takes me a while to make progress on my free time coding projects, so new target files keep coming out for me to test. I want to be able to post on the LKML the results of recompressing the latest 2.6 and 2.4 kernels.
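One way around the pipeline-timing problem, sketched in Python rather than shell: hand the whole pipeline to the shell as a single string and measure the wall clock around it (the example pipeline here is a stand-in, not my actual sort preprocessor):

```python
import subprocess
import time

def time_pipeline(command):
    """Run a shell pipeline and return elapsed wall-clock seconds.

    Passing one string with shell=True keeps the pipes intact, so the
    measurement covers every stage of the pipeline, not just the first.
    """
    start = time.perf_counter()
    subprocess.run(command, shell=True, check=True,
                   stdout=subprocess.DEVNULL)
    return time.perf_counter() - start

# Stand-in pipeline; imagine `find | sort | tar | gzip` here instead.
elapsed = time_pipeline("printf 'b\\na\\nb\\n' | sort | uniq -c")
print(f"pipeline took {elapsed:.3f}s")
```

The separate-script approach I settled on amounts to the same trick: the shell sets up all the pipes before the clock effectively starts, so the whole chain gets timed as one unit.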
I keep optimistically downloading the latest kernels and then having real life interrupt things long enough for me to need a new version to continue. I'll stop doing that for a while though, until I've actually got a draft sitting in my postponed box of an e-mail to the LKML with the scripts already finished and attached, or actually inline I think. That's another problem: the LKML only accepts certain posts, and Linus usually only accepts things that are in a certain format (plain text inline, IIRC). That put me on a tangent of looking up the mailing list rules and reading the Linux Weekly News. It'll likely do the same once I get close enough again.

So with all my knowledge, reading, and interest in digging deep into open source stories that I see written/posted, I've thought about trying to get paid to write. These notes are a bad example of my ability to write, but a good example of what I enjoy writing about. I was solicited once to write a book on intrusion detection by a genuine publisher, but I kind of "flubbed" my response. I said that I'd be interested in contributing, but that I didn't think I'd have time to write a whole book. I kind of regret doing that, but I think it was the right thing to say (just look at my bad record of finding time to do coding). I'm hoping however that a paying gig would actually let me take some time away from real life to actually get things done (and I'm sure it would). Of course I've got to strike a balance to keep my home life happy and healthy (family, friends, and my own condition).

I've offered to write a piece on the history of the BSDs for the Linux Weekly News, but they didn't seem interested. They do post BSD articles, and I was pitching that I could write one that would show the parallels between AT&T vs. the Regents of the University of California (BSD) and the current SCO vs. IBM etc. It's interesting how history repeats itself.
For good reference I'd suggest reading the FreeBSD mailing list archives (a Google search found some good stuff). Later I might publish the research that I used as part of my pitch for my "BSD history repeats itself" story. I'm also probably going to consider writing about why I don't want to publish my unrealized ideas. I'll also probably talk about:
- Why I don't write about office politics
- Why I don't write much about my personal private home life (well, maybe I made that clear <g>)
- My music ideas
- My thoughts and research into a self powered home (well, actually getting power from alternate sources like sun, wind, water...)
- Thoughts on using "image stacking" for amateur (and hopefully professional) astronomy (I'll talk about this because other people have already implemented some of it)
- Some ideas for how people can generate data that's easier to compress (e.g. typing in lower case when there's the option, removing obviously redundant information, using the same words...)
- Perhaps my ideas on natural language processing
...

I may eventually post my project ideas from the last fourteen years that I've been writing down on paper. Consider sending me money! My resume is at http://www.boxheap.net/ddaniels/resume.html Oh, and I'll probably write about resume creation and open source tools to do it (hey, maybe lwn.net would be interested in buying that article).

Originally from: http://www.boxheap.net/ddaniels/notes/20050811.txt
August 10, 2005
20050810
There don't appear to be any adopted standards for MPEG over IP. IP over MPEG looks more interesting: just packetize an IP stream into a packetized elementary stream (PES) and multiplex it into a valid MPEG-2 transport stream. MPEG-2 typically gets transferred over DVB-ASI, DVB-C, DVB-S, DVB-T and other protocols (even "ATSC" AKA SMPTE 310M).

So how do you packetize IP packets to go into an MPEG stream? Well, that depends on the source. I'd like to think that any IP source "worth its salt" is a live network. Thus a network feed would need to be input into the packetizer, multiplexed, and put out over a different type of device. I've heard of some people making a network device driver for DVB-ASI cards, but at least one engineer I talked to said there's probably a better way. He suggested keeping the regular characteristics of the ASI device and doing the packetizing in application space. I managed to convince him, however, that the conveniences of creating a network device which can be bridged would be far better. He stuck with the separate device driver idea, though, and suggested one driver could use the other.

So then the question is, how do you create a network device driver that's just a packetizer, multiplexer and forwarder? No doubt there are some good examples out there, and NDIS should make it easier. I still worry that doing more than elementary processing in a driver might cause some strange system behavior. I guess I should also say there's probably an even easier way to do things in Linux and the FreeBSD variants, but I'm mostly focused on the Microsoft world, as Marketing tells me that's what's wanted. On the opposite end you need a depacketizer, or something to demultiplex the stream and put IP back out onto the network. I've seen this done in software, and that might make more sense on this side of the transfer.
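To make the framing concrete, here is a deliberately stripped-down sketch of chopping a datagram into 188-byte transport packets. It only shows the sync byte, the 13-bit PID and the stuffing; real encapsulation would go through PES (or MPE), set the payload unit start indicator, maintain continuity counters, and so on.

```python
SYNC = 0x47
TS_PACKET = 188
HEADER = 4                     # sync byte plus three header bytes
PAYLOAD = TS_PACKET - HEADER   # 184 payload bytes per packet

def packetize(data, pid):
    """Split data into fixed 188-byte transport packets (toy version:
    header flags and continuity counters are left at zero)."""
    packets = []
    for i in range(0, len(data), PAYLOAD):
        chunk = data[i:i + PAYLOAD]
        header = bytes([
            SYNC,
            (pid >> 8) & 0x1F,  # top 5 bits of the 13-bit PID
            pid & 0xFF,         # low 8 bits of the PID
            0x10,               # payload only, continuity counter 0
        ])
        packets.append(header + chunk.ljust(PAYLOAD, b"\xff"))
    return packets

pkts = packetize(b"x" * 400, pid=0x100)
print(len(pkts), all(len(p) == TS_PACKET for p in pkts))  # 3 True
```

Whether this framing lives in a network device driver or in application space is exactly the argument with the engineer above; the loop itself is the easy part.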
The engineer that I speak of above, however, suggested that the unidirectional nature of MPEG-2 transport streams would pose another problem: associating one direction of traffic with the other. I'm not quite sure how other people bind one transfer direction to another, but I remember several satellite companies offering service that beamed high speed broadband internet access to customers and accepted data back from them via telephone modem. So schemes that put the two directions on separate devices have been around for a while. I just hope that modern network stacks are smart enough to remember that it's allowed.

I remember someone telling me that the ARPA network was an experiment designed with the goal that it be able to stay up even if one link in the network went down. It failed, or at least that's the punchline. The modern Internet can't always reroute if there's a failure in a router. There was a fire in a telecom building in Toronto, and connections from Manitoba Telephone Services (MTS) to Shaw in Winnipeg went down. I've also seen an outage in Shaw's network cause places to become inaccessible, where if you had a proxy on a CA network accessible address you could still access the rest of the internet. Those are just two local examples that I know about. The CA network thing is political (I'm told they're not allowed to carry commercial data due to their funding grants). For what it's worth, I've also seen my share of misconfigured routers; the more obvious cases were with major telecom companies.

So, back to MPEG transport streams. I know companies like Norsat have been selling "solutions" to do these things for years, so I think there's a market. Identifying the market potential is difficult, because it's not something most broadcasters, stations, and local distributors are looking for. It's also not something that's really even remotely accessible to consumers.
A similar issue that I've thought about for even more years is using multiple links between computers to increase throughput. I know lots of other people have looked at bridging and bonding, but I wanted to look at it at an even more insane level: serial ports. Actually, I wanted to look at parallel ports, modems, Ethernet, etc. I suppose it is possible to bond all these links together, but it certainly isn't common enough that it's as easy as listing the links (at least as far as I know).

So why bother with all this legacy stuff? Why not build a new network card that can communicate at the full bus speed? Well, actually, we're pretty close to that now. From my own experiences I've calculated that modern HD-SDI cards must be close to maxing out the bus throughput. I've also learned that multiple cards on the same bus can't allow a faster network connection, as of course there's only the one bus. I've seen computers with multiple buses, but it's hard to know if they're truly independent or if they're more likely bridged. Even a bridged network of buses can let each bus operate almost independently; if the parent bus is faster than its children combined, there may be an advantage to using multiple cards. It's also important to note that most modern buses have bottlenecks. When was the last time you looked up the DMA latency of the motherboard you wanted to buy? I'll wager never. I've thought about how useful this could be to consumers, and whether there was a way I could get the company that I'm working for to publish regular results of DMA throughput and latency. We could then get free motherboards. The idea likely wouldn't work though, as that's not really what the business does.

Other crazy ideas I've had include using every processor in the system to do computations, including the ones in IDE/ATA hard drives (they have RAM too!). Alas, most of it would be very convoluted to figure out a way to use.
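The "close to maxing out the bus" claim is easy to put numbers on. SMPTE 292M (HD-SDI) runs at 1.485 Gbps, while classic 32-bit/33 MHz PCI peaks at about 133 MB/s, so a full HD-SDI stream doesn't even fit on that bus and the cards need 64-bit and/or 66 MHz slots:

```python
# Theoretical peak of classic PCI: 33 MHz clock, 32-bit transfers.
# Real DMA never sustains the theoretical peak, which only makes the
# gap worse in practice.
pci_peak_bps = 33e6 * 32          # 1.056 Gbps (about 133 MB/s)
hd_sdi_bps = 1.485e9              # SMPTE 292M line rate

print(f"PCI 32/33 peak: {pci_peak_bps / 1e9:.3f} Gbps")
print(f"HD-SDI:         {hd_sdi_bps / 1e9:.3f} Gbps")
print(f"bus needed:     {hd_sdi_bps / pci_peak_bps:.0%} of peak")
```

The 141% figure is exactly why DMA latency and sustained throughput, not just the nominal bus width, are the numbers worth publishing.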
The more recent idea that I've had (since Mark Nelson's "random" binary file challenge was posted) was to figure out a list of common instructions and library calls for which I could get more output than would be required to issue the request for the data (e.g. more bits from register results than from the instruction cost...). This idea has some potential, but my current "hurdle" is finding the time to get and go through a list of CPU instructions. Getting a list of available function calls is also a challenge, although maybe a nice program to do it already exists (just get the exports from all DLLs etc.?).

So, as you might be beginning to see, one of my primary interests is data compression. I used to be interested in the pure pattern finding, making the smallest representation possible of common data, but then I started working in multimedia. It became obvious very fast that the speed of compression actually is important (not just to those who can't wait). If things don't compress fast enough you can get overruns, data loss and ultimately data corruption (even if that just means missing bits/bytes/frames...). One of my past pet projects was zlib compression of SDI (see StreamBed's deflate option). I've found that I can gzip at 270,000,000 bps (that's 270 Mbps in SI notation; the small b means bits, of course). The problem is that it either needs a fast processor or a simple pattern (like colour bars). It may even need both; I haven't had the time to check. Unfortunately, without the inflate option in StreamBed, customers aren't too interested yet, and without customer interest my boss isn't too interested yet either. I later plan to talk about:
- compression of MPEG transport streams (they're already MPEG compressed, but the tables are text, and there's that predictable 0x47 once per packet or 188/204 bytes).
- lossy MPEG table compression (recreate it to meet spec)
- Alternate SDI compression
- Compression in firmware
- My literally magic tarball scripts to improve general tar.gz and tar.bz2 compression with an O(1) + small-sort preprocessor
- My extensive prediction by partial match (PPM) research and documentation (I've spent at least the last 5 years working on a new algorithm that I have high hopes for, but strange results)
- My ideas (hopefully implemented soon) for a very simple honeypot-like intrusion detection system
- problems with current network security tools
- Perhaps products and development directly related to my jobs
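On the deflate speed point above: throughput claims like my 270 Mbps figure depend heavily on both the CPU and how compressible the data is, and that's easy to demonstrate with zlib directly (this is a generic benchmark sketch of my own, not StreamBed code):

```python
import time
import zlib

def deflate_throughput(data, level=6):
    """Compress data once with zlib and return throughput in bits/s."""
    start = time.perf_counter()
    zlib.compress(data, level)
    return len(data) * 8 / (time.perf_counter() - start)

easy = bytes(range(256)) * 4096   # 1 MiB, very repetitive (like colour bars)
hard = bytes((i * 37 + i // 3) % 251 for i in range(1 << 20))  # messier

print(f"repetitive data: {deflate_throughput(easy) / 1e6:.0f} Mbps")
print(f"messier data:    {deflate_throughput(hard) / 1e6:.0f} Mbps")
```

Running both cases on the same machine shows the pattern-versus-processor trade-off directly: the repetitive input deflates far faster than the messy one at the same compression level.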