Technical aspects of seeding torrents (relocated)


  • Technical aspects of seeding torrents (relocated)

    Originally posted by Siviak
    here's the skinny... to save poor old Bink some bandwidth, he should create the torrent and share it with one or two people... then they share it with one or two... after that you can open it up for everyone...
    This seems like the way to handle it with more traditional methods (i.e., non-peer-to-peer protocols), but it won't really help his bandwidth with a torrent. As soon as he opens it up, it is likely that his seed will send out data as fast as his defined limit will allow. He might as well just post it in the public and make sure those one or two people join early on ... the effect will be the same, no?

  • #2
    As soon as he opens it up, it is likely that his seed will send out data as fast as his defined limit will allow.
    This is why it is important to preseed files that you know will have a high demand. I would think having 4 or 5 seeds ready to go when the torrent is posted should be enough. From there, the pressure on these seeds only goes down as time goes on.

    It's highly effective. The best example was probably the newest release of Ubuntu. It started with about 6 seeds and slow download speeds (6 seeds serving over 1000 peers), but within a few hours the overall speed increased, and a late arrival could download the file in less than an hour.
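
    Back-of-the-napkin version of why the pressure drops (numbers completely made up, and this toy ignores leecher-to-leecher sharing, churn, and per-peer download limits, so take it as a sketch only):

    # Toy model: once the swarm holds N full copies, it can push out roughly
    # N more copies per "copy time" (file size / per-peer upload rate), so the
    # original handful of seeds matters less and less.  All numbers invented.
    file_kb = 700 * 1024          # a CD-sized ISO, in kB
    upload_kbps = 100             # per-seed upload cap, kB/s (hypothetical)
    copy_time_h = file_kb / upload_kbps / 3600

    copies, hours = 5, 0.0        # start with 5 pre-seeds
    while copies < 1000:          # until a 1000-peer swarm is saturated
        copies = min(1000, copies * 2)
        hours += copy_time_h
        print(f"after ~{hours:.1f} h: roughly {copies} full copies in the swarm")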
    jur1st, esq.



    • #3
      Originally posted by jur1st
      This is why it is important to preseed files that you know will have a high demand.
      I'm sorry if this is getting dangerously off-topic (any help from a moderator?), but I'm really interested in this topic.

      Maybe I'm being dense, but how is the situation any different (assuming, of course, that you preseed over the same network as you normally seed)? BitTorrent attempts to spread as many copies across the network as fast as it possibly can. If you start with one seed, 5 soon-to-be seeders, and a bunch of people downloading, how are you any worse? It would seem that you would be better off because the fastest contributors (whether they intend to be long-term seeders or not) will get the files first and share the most, no? Even in your pre-seeding example, the download started slow and got faster as more people joined. A single person being able to provide a file is a large part of why BitTorrent became so popular with the copyright-infringement crowd.

      The only "down" side is that the connection will (for a short amount of time) be limited by the originator of the content. In other words, very early joiners are going to be crawling along. While this might be undesirable in the short term, it helps the long-term view. Considered another way, though, the period of slowness is when the pre-seeders would have been loading anyway (i.e., a slow connection versus no connection).



      • #4
        Originally posted by Voltage Spike
        I'm sorry if this is getting dangerously off-topic (any help from a moderator?)
        Done. Cropped and moved to "Got Questions"-- although, doesn't this defeat the purpose of the forums? -- To derail as many threads as possible? ]:>



        • #5
          Originally posted by TheCotMan
          To derail as many threads as possible?
          That's okay, I'll leave the derailing to you.

          (Wait. Did I just post an off-topic reply in "my" own thread? )



          • #6
            Exactly. The real advantage of preseeding (though I'm not an experienced seeder) is speed for the community, and probably, in the long run, less total bandwidth for the original seeder. We just have to rearrange your previous comment...

            "it helps the long-term view, the period of slowness is when the pre-seeders [load], and then the download [gets] faster as more people join"

            This way, the download time for most of the people downloading would be shorter from the very beginning.

            I'd sit here and figure out the math about how many gigs would be transferred per peer, but I think we can agree that, at the very minimum, preseeding would be a wash against starting with a single seed.
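
            Not real math, but a rough sanity check on the "wash" part (made-up numbers, and it assumes the originator stays pegged at its cap until roughly one full copy is out, pre-seed or not):

            # Rough check on what the original seeder uploads, pre-seed or not.
            # Assumes the originator is pegged at its upload cap either way and
            # hands off about one full copy before it can back off.
            file_gb = 4.0                # hypothetical file size
            seeder_up_kbps = 64          # hypothetical upload cap, kB/s

            hours_pegged = file_gb * 1024 * 1024 / seeder_up_kbps / 3600
            print(f"~{file_gb} GB and ~{hours_pegged:.0f} h at full rate either way")
            # Pre-seeding only shifts *when* those hours happen (before the public
            # release instead of after), not how many gigs the originator sends.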
            jur1st, esq.



            • #7
              Originally posted by jur1st
              Exactly. The real advantage of preseeding (though I'm not an experienced seeder) is speed for the community, and probably, in the long run, less total bandwidth for the original seeder.
              I tend to think of it as a long period of 0 KiB/s (while pre-seeding) followed by a possibly higher download rate after the torrent goes public. I'm not so sure the effect would even be noticeable, though. The peers don't have to wait for the entire download, and the designated seeders begin contributing to the download after receiving their first fragment.

              The flaw, of course, is that the torrent creator will send some pieces to slow contributors or unreachable clients. (I believe the seeders are equally generous to all downloaders because they can't make a client-side decision about the peer's generosity, but the tracker, assuming one exists, might actually take this into consideration. I'll look into this.) If I upload to a user with an asymmetric connection, then we might consider that upload to be a detriment to the overall bandwidth.

              The flaw in the flaw is that I'm inclined to believe Bram Cohen accounted for this in some manner (e.g., through the aforementioned tracker). Unfortunately, clients can't simply trust each other's reported numbers, since that would open the door to cheating by fudging contribution figures (which was/is not uncommon in other peer-to-peer networks).

              Originally posted by jur1st
              This way, the download time for most of the people downloading would be shorter from the very beginning.
              Maybe, but I am also being lazy about trying to figure out the math behind this.

              I believe the difficult part is in modeling how peers join a download. We can probably assume that not everyone joins at the same time, and, after some insignificant amount of time, the designated seeders will also be contributing at their maximum rate so that newer downloaders can catch up to older downloaders. It's like some sort of file-transfer Ponzi scheme! Cash out while you can!
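
              If I were going to model it, I'd probably start with something this crude (all numbers invented, and it ignores piece scarcity, choking, and connectability entirely):

              # Crude toy model: peers trickle in, anyone holding data uploads at
              # a flat cap, and downloads are limited only by total swarm upload.
              FILE = 700 * 1024            # file size in kB
              UP = 64                      # per-peer upload cap, kB/s
              STEP = 60                    # one-minute time steps
              have = {"origin": FILE}      # kB of the file each peer holds

              for t in range(0, 4 * 3600, STEP):
                  if t < 3600:             # 5 new leechers join per minute for an hour
                      for i in range(5):
                          have[f"peer{t}_{i}"] = 0
                  needy = [p for p, amt in have.items() if amt < FILE]
                  if needy:
                      pool = sum(UP * STEP for amt in have.values() if amt > 0)
                      share = pool / len(needy)        # split the swarm's upload evenly
                      for p in needy:
                          have[p] = min(FILE, have[p] + share)
                  if (t + STEP) % 3600 == 0:
                      done = sum(1 for amt in have.values() if amt >= FILE) - 1
                      print(f"hour {(t + STEP) // 3600}: {done}/{len(have) - 1} leechers done")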

              Originally posted by jur1st
              I'd sit here and figure out the math about how many gigs would be transferred per peer, but I think we can agree that, at the very minimum, preseeding would be a wash against starting with a single seed.
              This is likely the case, since, whether we pre-seed or not, it is probably safe to assume that the originator will be pegged at his maximum upload rate. The question is really one of overall bandwidth, then.


              (Of course, this discussion is moot once Bascule releases PDTP, and I'm sure he'll come along to join in blasting BitTorrent at some point. )
              Last edited by Voltage Spike; August 9, 2005, 22:25.



              • #8
                I think the general idea of a successful pre-seed has one primary component that makes it worthwhile.

                TheCotMan creates a torrent of 1337 donkey pr0n. If he opens it up wild, it will function as any torrent should, filling the bandwidth as much as it possibly can. Unfortunately (in this scenario I build), TheCotMan refuses to give up his 56k connection.

                A` is pimping at home with his T1. A` can't wait to spread the donkey pr0n to all his friends as fast as possible. On the other side of Russia, a train travelling 65 miles per hour downhill is about to hit a panda 4.5 miles away. ... I think you might get where I'm heading.

                It would make more sense for TheCotMan to initially use his full bandwidth to get the file to A`, since no matter how fast the torrent works, the original copy can only spread at 56k. If you limit the seed to one peer (hopefully the high-bandwidth cowboy A`), then the torrent will spread at a full 56k to A`, who will in turn be accepting connections from others to spread the chunks as they are being retrieved from the seed. OTOH, you will run into several theoretical slowdowns if the 56k is being used by 4 clients (even if all chunks pulled are unique).

                First, you are increasing the overhead of TheCotMan's connection by maintaining four connections out his pipe instead of one. OK... so this is probably minimal. Second, you are increasing the likelihood that one or more of the clients will error in downloading and have to ditch the chunk. Third, and probably most prominent, the likelihood that all 4 clients will accept outside connections is probably slim. I would have thought otherwise, but the 21 leeching peers on the torrent taught me otherwise. True, the seed will adjust down what it sends them because their upload status blows, but they will still be taking a chunk of the 56k without distributing it to others. In the ideal scenario, A` would have that 56k filled to his box while other peers leeched the chunks off of his T1 instead of the 56k.

                The key component is a low-bandwidth home with a high-bandwidth destination. If the initial seed's bandwidth is greater than or equal to that of the destination(s), with a decent handful of open peers available, then pre-seeding is probably nearly pointless.
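
                Made-up numbers to go with that (and assuming A` really is connectable and re-shares chunks as they arrive):

                # Hypothetical figures: a 56k originator either feeds A`'s T1
                # exclusively, or splits its upload four ways among leechers
                # that barely re-share.
                file_kb = 100 * 1024          # say a 100 MB file
                modem_kbps = 7                # ~56 kbit/s is roughly 7 kB/s
                t1_kbps = 190                 # ~1.5 Mbit/s is roughly 190 kB/s

                hours_to_full_copy = file_kb / modem_kbps / 3600
                copies_per_hour_from_A = t1_kbps * 3600 / file_kb
                per_leecher_kbps = modem_kbps / 4

                print(f"feed A` alone: a full copy is off the 56k in ~{hours_to_full_copy:.1f} h, "
                      f"and A`'s T1 can then push ~{copies_per_hour_from_A:.1f} copies/hour onward")
                print(f"split four ways: each leecher sees only {per_leecher_kbps:.2f} kB/s "
                      f"from the original 56k seed")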
                Last edited by converge; August 10, 2005, 12:34. Reason: guinnea pig took over for a second
                if it gets me nowhere, I'll go there proud; and I'm gonna go there free.



                • #9
                  Originally posted by converge
                  TheCotMan creates a torrent of 1337 donkey pr0n.
                  No way! I would not share my donkey pr0n!
                  I mean, no way, I don't have any donkey pr0n!

                  Ignoring issues of lost chunks, if the initial seeder is a "super seeder" to clients that are all "connectable", then the issue of session overhead is purely OS and hardware bound.

                  The issue of packet header vs. payload overhead for multiple connections would likely not matter either, because almost all outgoing packets will be at or near maximum size, which minimizes the header-vs.-payload inefficiency. And if super seeding is enabled and working, we still assume no lost chunks, all initial clients are sharing with each other, and all initial super-seeded clients stay connected until they are done and become seeds, then no chunk should ever be sent more than once by the original source during the pre-seeding, and the cost of OS latency per session could easily be many orders of magnitude less than the total time for sending to just one peer.

                  Where do costs creep into the picture?
                  c0nverge mentioned the issue of lost chunks, but we also have issues found in the assumptions listed in the above example.

                  The largest advantage of pre-seeding with a super-seeding client (IMO) is one of planning: you can pick a release date/time and count on speed, because you know the pre-seeded clients will be connectable and sharing.
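
                  For anyone who hasn't looked at it, the super-seeding trick is roughly this (a very simplified sketch of the idea, not any particular client's actual code):

                  # Simplified sketch of super seeding: the initial seed reveals only
                  # one piece at a time to each peer, and only offers that peer another
                  # piece after the first one has been re-announced ("HAVE") by somebody
                  # else, so each piece tends to leave the seed only once.
                  class SuperSeed:
                      def __init__(self, num_pieces):
                          self.unsent = set(range(num_pieces))   # pieces never handed out
                          self.pending = {}                      # peer -> piece revealed to it

                      def offer_piece(self, peer):
                          if peer in self.pending:
                              return None      # still waiting for its last piece to propagate
                          if not self.unsent:
                              return None      # every piece has been handed out at least once
                          piece = self.unsent.pop()
                          self.pending[peer] = piece
                          return piece

                      def on_have(self, announcing_peer, piece):
                          # some peer announced this piece; credit whoever we sent it to
                          for peer, sent in list(self.pending.items()):
                              if sent == piece and peer != announcing_peer:
                                  del self.pending[peer]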

                  When pre-seeding is done, the seeding peers offer an aggregate throughput to connecting hosts.

                  If there is an assumption that most (if not all) hosts are "not connectable" then pre-seeding to all connectable hosts will indeed be faster than not pre-seeding.

                  If all hosts are "connectable" and share as quickly as they take, pre-seeding is not as effective, and would not have been much better than not pre-seeding. Why the difference? Non-connectables drain the UL rate from the initial seeder, while the "connectable" peers, which could be sharing to others, suffer.

                  Hosts using ADSL frequently have faster DL rates than UL rates. Also, many users (connectable or not) choose to scale back the rate they share out.

                  The weak links in any torrent are those people who are connectable. Sixteen leechers, each holding a unique 1/16th of a full seed, who are all "non-connectable" (through firewall/filter/masquerading/NAT/*) can do nothing to share with each other.

                  Enter a single leecher who is at 0% but connectable and will eventually share, and now each of the other 16 peers will be able to connect with the only connectable client and donate what they have. As each donates a chunk, they start noticing the only connectable client has chunks they do not, and they each start to share through the only connectable client. However, if that connectable client disconnects after it is done, there is still a risk that none of the 16 "non-connectable" clients will be at 100%.
                  If we make some assumptions about everyone having the same throughput for UL but unlimited throughput for DL, then we could expect each of the 16 "non-connectable" peers to have very little progress when the only "connectable" client reaches 100%.

                  Simple numbers for upload? Assume each can upload at 64kB/sec., but have unlimited download.
                  Each "non-connectable" shares to the connectable at 64kB/sec.
                  The only "connectable" shares to all the "non-connectable" peers at 64kB/sec divided by 16 clients, or 4kB/sec per client.
                  This means the "connectable" in this case runs 16 times faster than its "non-connectable" peers. Under ideal conditions, when the "connectable" reaches 100% (assuming what is given) the non-connectable peers have moved from 1/16 complete to 1/8 complete.

                  I could write more, but I'll wait for other posts on this subject. (If I put my whole response in one reply, nobody would read it... except you... the person that made it to the end ;-)
                  Last edited by TheCotMan; August 10, 2005, 14:11.



                  • #10
                    Originally posted by TheCotMan
                    Simple numbers for upload? Assume each can upload at 64kB/sec., but have unlimited download.
                    Each "non-connectable" shares to the connectable at 64kB/sec.
                    The only "connectable" shares to all the "non-connectable" peers at 64kB/sec divided by 16 clients, or 4kB/sec per client.
                    This means the "connectable" in this case runs 16 times faster than its "non-connectable" peers.
                    Not quite. "connectable" should be able to pull in at 16*64 KiB/s (1024 KiB/s), but the "non-connectables" are still at 4 KiB/s. In other words, 256 times faster.

                    Originally posted by TheCotMan
                    Under ideal conditions, when the "connectable" reaches 100% (assuming what is given) the non-connectable peers have moved from 1/16 complete to 1/8 complete.
                    It's obviously much worse, then. "non-connectables" have only received 1/256 of the file(s) for a total of 17/256 complete.
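
                    Quick check on those numbers, with the same assumptions (64 KiB/s upload each, unlimited download, the connectable peer starting at 0%):

                    # 16 non-connectable peers (a unique 1/16th each) plus one
                    # connectable peer at 0%; 64 KiB/s up, unlimited down.
                    UP, N = 64.0, 16
                    connectable_dl = N * UP        # pulls from all 16 at once: 1024 KiB/s
                    nonconnectable_dl = UP / N     # the lone connectable splits 64 KiB/s 16 ways
                    print(connectable_dl / nonconnectable_dl)     # 256.0
                    # While the connectable peer grabs the whole file, each non-connectable
                    # peer only receives 1/256th of it: 1/16 + 1/256 = 17/256 complete.
                    print(1 / 16 + 1 / 256 == 17 / 256)           # True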

                    Damn people. Learn to forward your ports until we get IPv6...

                    On a side note: isn't there a Cisco-created protocol for NAT devices that automatically routes UDP packets when it detects an attempt to connect from the local network? (A -> B [drop packet, punch hole from B to A], B -> A [drop packet, punch hole from A to B], B and A can now freely stream UDP packets to each other assuming the source/destination ports on the packets are in agreement.) Can/does BitTorrent take advantage of this (I've noticed my client opens a UDP port, but I don't know that it uses it)? Assuming I am not making this up, does the automatic routing apply to TCP connections, too?



                    • #11
                      Originally posted by Voltage Spike
                      Not quite.
                      Crap. Mistakes in math suck, but corrections are very cool. Thanks! :-)

