These are more random design notes I need to keep track of.
2
3Perhaps counterWork() should be replaced with direct counter
4manipulation. This violates the functional style of the state
5functions, but only a little bit...
6
7IRRELEVANT: The vtag return value should be subsumed into repl.
8IRRELEVANT: sctp_make_chunk() should NOT be called directly from these
9IRRELEVANT: functions.
10
11I am very unhappy with retval->link. That means a LOT of copying.
12
13DONE: Basic principle for host or network byte order:
14DONE: Network byte order should be as close to the network as
15DONE: possible.
16DONE: This means that the first routine to manipulate a particular header
17DONE: should convert from network byte order to host byte order as
DONE: soon as it gets it from the next lower layer.
19DONE: Outbound, the last routine to touch a header before passing it
20DONE: to the next lower layer should convert it to network order. For
21DONE: queues, the routine at the top (closer to user space) does the
22DONE: conversion--inbound queues are converted to host order by the
23DONE: reader, outbound queues are converted to network order by the
24DONE: writer.
25DONE:
26DONE: Forget that smoke. The problem is that this entails reparsing the
27DONE: header when it comes time to pass it to the lower layer (e.g. you need
28DONE: to check the SCTP header for optional fields). The code which fills
29DONE: in a field should put it in network order.
30DONE:
31DONE: POSSIBLY on inbound, the code which parses the header should convert
32DONE: it to host order... But on outbound, packets should ALWAYS be in
33DONE: network byte order!
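
A minimal sketch of where that rule ends up: whoever fills in a field stores it in network order, and inbound code converts at the point of use. The struct and function names here are hypothetical stand-ins, not the ones in the tree.

```c
#include <stdint.h>
#include <arpa/inet.h>

/* Hypothetical, simplified SCTP common header; every field is kept
 * in network byte order for as long as it lives in the packet. */
struct example_sctp_hdr {
	uint16_t source;
	uint16_t destination;
	uint32_t verification_tag;
};

/* Outbound: the code that fills in a field stores it in network order,
 * so nothing has to reparse the header on the way down. */
static void example_fill_hdr(struct example_sctp_hdr *sh, uint16_t sport,
			     uint16_t dport, uint32_t vtag)
{
	sh->source = htons(sport);
	sh->destination = htons(dport);
	sh->verification_tag = htonl(vtag);
}

/* Inbound: convert when reading instead of rewriting the header. */
static uint32_t example_hdr_vtag(const struct example_sctp_hdr *sh)
{
	return ntohl(sh->verification_tag);
}
```

The header bytes stay valid wire format the whole time, which is the point of "outbound packets are ALWAYS in network byte order".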
34
35
36OK, we need to add some stream handling. This means that we are
37updating sctp_create_asoc() among many other functions. I think we
38want some functions for dereferencing streams...
39
40
41DONE: NOTES FOR TSNMap
42DONE:
43DONE: Variables:
44DONE: uint8_t *TSNMap Array counting #chunks with each TSN
45DONE: uint8_t *TSNMapEnd TSNMap+TSN_MAP_SIZE
46DONE: uint8_t *TSNMapOverflow counters for TSNMapBase+TSN_MAP_SIZE;
47DONE: uint8_t *TSNMapCumulativePtr Cell for highest CumulativeTSNAck
48DONE: uint32_t TSNMapCumulative Actual TSN for *TSNMapCumulativePtr
49DONE: uint32_t TSNMapBase Actual TSN for *TSNMap
50DONE: long TSNMapGap chunk.TSN - TSNMapBase
51DONE:
52DONE: Constants:
53DONE: TSN_MAP_SIZE
54DONE:
55DONE: TSNMap and TSNMapOverflow point at two fixed buffers each of length
56DONE: TSN_MAP_SIZE. When TSNMapCumulativePtr passes TSNMapEnd (i.e. we send
57DONE: the SACK including that value), we swap TSNMap and TSNMapOverflow,
58DONE: clearing TSNMap.
59DONE:
60DONE: This work should be done OUTSIDE the state functions, as it requires
61DONE: modifying the map. It is sufficient for the state function to return
62DONE: TSNMapGap. Take care that TSNMapGap is never 0--we reserve this value
63DONE: to mean "no TSNMapGap".
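
The two-buffer scheme above can be sketched roughly as follows. This is an illustration under assumptions, not the real code: the names are hypothetical, TSN_MAP_SIZE is tiny so the behavior is easy to follow, and the base is assumed to sit one TSN below the first expected TSN so that a valid gap is never 0. The swap clears the buffer that has just been retired, which is one reading of "clearing TSNMap".

```c
#include <stdint.h>
#include <string.h>

#define TSN_MAP_SIZE 8	/* Tiny on purpose; the real value is not given here. */

struct example_tsnmap {
	uint8_t buf[2][TSN_MAP_SIZE];	/* the two fixed buffers */
	uint8_t *map;			/* counters for base .. base+SIZE-1 */
	uint8_t *overflow;		/* counters for base+SIZE .. base+2*SIZE-1 */
	uint32_t base;			/* actual TSN for map[0] */
};

static void example_tsnmap_init(struct example_tsnmap *m, uint32_t base)
{
	memset(m->buf, 0, sizeof(m->buf));
	m->map = m->buf[0];
	m->overflow = m->buf[1];
	m->base = base;
}

/* Count a received TSN. Returns the gap (tsn - base), or 0 for
 * "no gap" when the TSN falls outside both buffers. */
static long example_tsnmap_mark(struct example_tsnmap *m, uint32_t tsn)
{
	uint32_t gap = tsn - m->base;	/* unsigned subtraction handles wrap */

	if (gap == 0 || gap >= 2 * TSN_MAP_SIZE)
		return 0;
	if (gap < TSN_MAP_SIZE)
		m->map[gap]++;
	else
		m->overflow[gap - TSN_MAP_SIZE]++;
	return (long)gap;
}

/* Called once the cumulative point has passed the end of the first
 * buffer: the overflow buffer becomes the map, the old map is cleared
 * and reused as the new overflow. */
static void example_tsnmap_swap(struct example_tsnmap *m)
{
	uint8_t *old = m->map;

	m->map = m->overflow;
	m->overflow = old;
	memset(m->overflow, 0, TSN_MAP_SIZE);
	m->base += TSN_MAP_SIZE;
}
```

Only example_tsnmap_mark() computes anything a state function would need to return; the swap modifies the map and so belongs outside the state functions, as noted above.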
64
65
66DONE: FIGURE THIS OUT--which structures represent PEER TSN's and which
67DONE: structures represent OUR TSN's.
68DONE:
69DONE: Rename the elements to peerTSN* and myTSN*.
70
71
72ERROR IN Section 6.1:
73
74 Note: The data sender SHOULD NOT use a TSN that is more than
75 2**31 - 1 above the beginning TSN of the current send window.
76
77SHOULD be 2**16-1 because of the GAP ACKs.
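
The reasoning, as a sketch: a Gap Ack Block names TSNs by 16-bit offsets from the Cumulative TSN Ack, so a TSN further out than that cannot be reported in a gap. The helper below is made up for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper: compute the Gap Ack Block offset for tsn relative
 * to cum_ack. Returns 1 and stores the offset if it fits in the 16-bit
 * field, 0 if the TSN cannot be named in a gap report.
 */
static int example_gap_offset(uint32_t cum_ack, uint32_t tsn, uint16_t *offset)
{
	uint32_t diff = tsn - cum_ack;	/* wraps correctly for 32-bit TSNs */

	if (diff == 0 || diff > UINT16_MAX)
		return 0;
	*offset = (uint16_t)diff;
	return 1;
}
```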
78
79ERROR IN 12.2 Parameters necessary per association (i.e. the TCB):
80Ack State : This flag indicates if the next received packet
81 : is to be responded to with a SACK. This is initialized
82 : to 0. When a packet is received it is incremented.
83 : If this value reaches 2 or more, a SACK is sent and the
84 : value is reset to 0. Note: This is used only when no DATA
85 : chunks are received out of order. When DATA chunks are
86 : out of order, SACK's are not delayed (see Section 6).
87
88NOWHERE in Section 6 is this mentioned. We only generate immediate
89SACKs for DUPLICATED DATA chunks. Is this an omission in Section 6 or
90a left-over note in section 12.2?
91
92
93Section 6.1:
94
95 Before an endpoint transmits a DATA chunk, if any received DATA
96 chunks have not been acknowledged (e.g., due to delayed ack), the
97 sender should create a SACK and bundle it with the outbound DATA
98 chunk, as long as the size of the final SCTP packet does not exceed
99 the current MTU. See Section 6.2.
100
I definitely won't do this. What AWFUL layering!
102
We have this REALLY WEIRD bugoid. We SACK the first data chunk of the
104second packet containing data chunks. A careful reading of the spec
105suggests that this is legal. It kinda works, but we end up with more
106SACK timeouts than we might otherwise have... The fix is to split off
107the SACK generation code from the TSN-handling code and run it when we
108get either a NEW packet, or an empty input queue.
109
110
111
112OK: Section 6.2 does not explicitly discuss stopping T3-rtx. The worked
113OK: example suggests that T3-rtx should be canceled when the SACK is
114OK: lined up with the data chunk... Ah! Section 6.3...
115
116
117We really ought to do a sctp_create_* and sctp_free_* for all of the
118major objects including SCTP_transport.
119
120{DONE: Copy af_inet.c and hack it to support SCTP.
121
122If we were going to do SCTP as a kernel module, we'd do this:
123
124We can then socket.c:sock_unregister() the whole INET address family
125and then sock_register() our hacked af_inet...
126}
127
128
129SCTP_ULP_* is really two groups of things--request types and response
130types...
131
132DONE: We want to know whether the arguments to bind in sock.h:struct proto
133DONE: are user space addresses or kernel space addresses. To do that we
134DONE: want to find the tcp bind call. To do THAT we are looking for the
135DONE: place that struct proto *prot gets filled in for a TCP struct sock.
136
137
138API issue--how do you set options per association? Normal setsockopt
139will operate on an endpoint. This is mostly an issue for the
140UDP-style api. The current solution (v02) is that all associations on
141a single socket should all have the same options. I still don't like
142this.
143
144Write a free_endpoint(). Remember to free debug_name if allocated...
145
146DONE: Make sure that the API specifies a way for sendto() to use some kind
147DONE: of opaque identifier for the remote endpoint of an association. As
148DONE: observed before, it is a bad thing to use an IP address/port pair as
149DONE: the identifier for the remote endpoint...
150
151General BUG--sctp_bind() needs to check to see if somebody else is
152already using this transport address (unless REUSE_ADDR is set...)...
153
154sctp_do_sm() is responsible for actually discarding packets. We need
155a sctp_discard_packet_from_inqueue().
156
157Be sure to schedule the top half handling in sctp_input.c:sctp_v4_rcv().
158
159Keycode 64 is Meta_L, should be Backspace (or whatever that really
160is)...
161
162DONE: Should sctp_transmit_packet() clone the skb? [Yes. In fact we
163DONE: need a deep copy because of a bug in loopback. This problem
164DONE: sort of goes away with the creation of SCTP_packet.]
165
166- memcpy_toiovec() is for copying from a blob to an iovec...
- after(), before(), and between() are for comparing 32-bit wrappable
168 numbers...
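
For reference, wrap-safe comparisons of this kind usually look like the sketch below. These are stand-alone illustrations with made-up names, not the kernel's own definitions; the trick is that the unsigned difference is reinterpreted as signed, which relies on two's-complement conversion.

```c
#include <stdint.h>

/* Nonzero if a comes before b in 32-bit wrapping sequence space. */
static int example_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Nonzero if a comes after b. */
static int example_after(uint32_t a, uint32_t b)
{
	return example_before(b, a);
}

/* Nonzero if lo <= x <= hi, walking forward from lo with wraparound. */
static int example_between(uint32_t x, uint32_t lo, uint32_t hi)
{
	return hi - lo >= x - lo;
}
```

A plain `a < b` gets the wrong answer once the counter wraps: 0xfffffff0 is "before" 5 in sequence space even though it is numerically larger.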
169
Where do theobromides live? Are they fat soluble?
171
172printf "D %x\nC %x\nI %x\nP %x\nT %x\n", retval->skb->data, retval->chunk_hdr, retval->subh.init_hdr, retval->param_hdr, retval->skb->tail
173
174
175set $chunk = retval->repl->chunk_hdr
176set $init = (struct sctpInitiation *)(sizeof(struct sctpChunkDesc) + (uint8_t *)$chunk)
177set $param1 = (struct sctpParamDesc *)(sizeof(struct sctpInitiation) + (uint8_t *)$init)
178set $param2 = (struct sctpParamDesc *)(ntohs($param1->paramLength) + (uint8_t *)$param1)
179set $sc = (struct sctpStateCookie *)$param2
180
181DONE: run_queue sctp_tq_sideffects needs while wrapper.
182
183OK: Important structures:
184OK: protocol.c: struct inet_protocol tcp_protocol (IP inbound linkage)
185OK: tcp_ipv4.c: struct proto tcp_prot (exceptions to inet_stream_ops)
186OK: af_inet.c: struct proto_ops inet_stream_ops (sockets interface)
187
188Another unimplemented feature: sctp_sendmsg() should select an
189ephemeral port if a port is not already set...
190
191Path MTU stuff: Send shutdown with rewound CumuTSNack. Is this a
192protocol violation?
193
194NO: Use larger TSN increment than 1? Allows subsequencing [This is
195NO: patently illegal. The correct solution involves MTU calculations...]
196
197Lowest of 3 largest MTU's for fragmentation? Probably.
198Allows 2 RWINs worth of backup?
199
200Immediate heartbeat on secondary when primary fails?
201(Use fastest response on heartbeat to select new primary, keeping MTU in mind)
202This is probably illegal. v13 added stricter rules about generating
203heartbeats.
204
205[p- use 3 largest RWINs to select...]
206
207[jm- pick top 3 thruputs (RWIN/Latency), pick lowest MTU for the new primary
208 address ]
209
210
211Here is what we did to set up the repository:
212
213$ cd /usr/src/linux_notes
214$ bzcat ~/linux-2.4.0-test11.tar.bz2 | tar xfp -
215$ CVSROOT=:pserver:knutson@postmort.em.cig.mot.com:/opt/cvs
216$ export CVSROOT
217$ cd linux
218$ cvs import -m "plain old 2.4.0 test11" linux knutson start
219[Note that this EXCLUDES net/core.]
220$ cd ..
221$ mv linux linux-2.4.0-test11
222$ cvs co linux
223$ cd linux-2.4.0-test11/net
224$ tar cfv - core | (cd ../../linux/net;tar xfp -)
225$ cd ../../linux
226$ cvs add net/core
227$ cvs add net/core/*.c
228$ cvs add net/core/Makefile
229$ cd net
230$ cvs commit -m "add core"
231$ cd ..
232[Now we create the branch.]
233$ cvs tag -b uml
234[Move to that branch.]
235$ cvs update -r uml
236$ touch foo
237$ bzcat ~/patch-2.4.0-test11.bz2 | patch -p1
238$ for a in $(find . -newer foo | grep -v CVS); do echo $a; cvs add $a; done 2>&1 | tee ../snart
239$ cvs commit -m "UML patch for 2.4.0-test11"
240$ cvs tag latest_uml
241
2001 Jan 11
243When we close the socket, it shouldn't de-bind the endpoint. Any new
244socket attempting to bind that endpoint should get an error until that
245endpoint finally dies (from all of its associations dying).
246
247This issue comes up with the question of what should happen when we
248close the socket and attempt to immediately open a new socket and bind
249the same endpoint. Currently, we could bind the same endpoint in SCTP
250terms which would be a new endpoint in data structure terms and buy
251ourselves some confusion.
252
253DONE: Tue Jan 16 23:08:51 CST 2001
254DONE:
255DONE: We find that when we closed the socket (and nulled the ep->sk
256DONE: reference to it), we caused problems later on with chunks created for
257DONE: transmit. When we looked at TCP, we found that closing a TCP socket
258DONE: does not destroy it immediately--TCP also has post-close transactions.
259DONE:
260DONE: Solution: We use the ep->moribund flag to indicate when the socket is
261DONE: closed and do not immediately null the reference in ep.
262
263Wed Jan 17 01:21:40 CST 2001
264
265What happens when loop1 == loop2 in funtest1b (i.e., when the source &
266destination endpoints are identical)? We found out. You get a *real*
267simultaneous init and a burning desire to designate two loop addresses
268so you don't inadvertently put yourself in the same situation again.
269
270We will investigate more later, as this situation promises to test a
271potential weak point in the protocol (cf. siminit above).
272
273Tue Jan 30 14:50:39 CST 2001
274vendor: Linus
275release tag: linux-2_4_1
276
277DONE: We really ought to have a small utility functions file for test stuff
278DONE: (both live kernel and test frame).
279
280Here are all the timers:
281T1-init (per association)
282T1-cookie (per association)
283T3-rtx (per destination)
284heartbeat timer (per association)
285T2-shutdown (per association)
286?Per Destination Timer? (presumed to be T3-rtx)
287
288Mark each chunk with the transport it was transmitted on.
289
290When we transmit a chunk, we turn on the rtx timer for the destination
291if not on already. The chunk is then copied to q->transmitted. When
292we receive a sack, we turn off each timer corresponding to a TSN ACK'd
293by the SACK CTSN. This is because either everything got through, or
294the chunk outbound longest for a given destination got through.
295We then start the timers for destinations which still have chunks on
296q->transmitted, after moving the appropriate chunks to q->sacked.
297
298When a rtx timer expires for a destination, all the chunks on
299q->transmitted for that destination get moved to q->retransmit,
300which then get transmitted (a: at that time, b: when any chunks are
301transmitted, retransmissions go first, c: other).
302
WHEN PUSHING A CHUNK FOR TRANSMISSION
304
305
306
307WHEN TRANSMITTING A CHUNK
308Assign a TSN (if it doesn't already have one).
309Select a transport.
310If the T3-rtx for the transport is not running, start it.
311Make a copy to send. Move the original to q->transmitted.
312
313WHEN PROCESSING A SACK
314Walk q->transmitted, moving things to q->sacked if they were sacked.
315
316Walk chunk through q->sacked.
317 if chunk->TSN <= CTSN {
318 stop chunk->transport->T3RTX
319 free the chunk
320 }
321
322
323WHEN RTX TIMEOUT HAPPENS
324Walk chunk through q->transmitted
325 if chunk->transport is the one that timed out,
326 move chunk to q->retransmit.
327Trigger transmission.
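
The SACK and timeout walks above can be sketched like this. It is a toy model under assumptions: the queues are represented by a tag on each chunk instead of real lists, gap ack blocks are ignored (only the cumulative TSN ack is honored), and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum example_queue { Q_TRANSMITTED, Q_SACKED, Q_RETRANSMIT, Q_FREED };

struct example_transport {
	int t3_rtx_running;
};

struct example_chunk {
	uint32_t tsn;
	struct example_transport *transport;	/* path it was sent on */
	enum example_queue q;			/* which queue holds it now */
};

/* Wrap-safe "a <= b" for TSNs. */
static int example_tsn_lte(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) <= 0;
}

/* Process a SACK whose cumulative TSN ack is ctsn. */
static void example_process_sack(struct example_chunk *c, size_t n,
				 uint32_t ctsn)
{
	size_t i;

	/* Walk q->transmitted, moving sacked chunks to q->sacked. */
	for (i = 0; i < n; i++)
		if (c[i].q == Q_TRANSMITTED && example_tsn_lte(c[i].tsn, ctsn))
			c[i].q = Q_SACKED;

	/* Walk q->sacked: stop the chunk's T3-rtx and free the chunk. */
	for (i = 0; i < n; i++)
		if (c[i].q == Q_SACKED && example_tsn_lte(c[i].tsn, ctsn)) {
			c[i].transport->t3_rtx_running = 0;
			c[i].q = Q_FREED;
		}

	/* Restart timers for destinations that still have chunks out. */
	for (i = 0; i < n; i++)
		if (c[i].q == Q_TRANSMITTED)
			c[i].transport->t3_rtx_running = 1;
}

/* T3-rtx expired on transport t: move its outstanding chunks to
 * q->retransmit. Returns how many chunks were moved. */
static size_t example_rtx_timeout(struct example_chunk *c, size_t n,
				  struct example_transport *t)
{
	size_t i, moved = 0;

	t->t3_rtx_running = 0;
	for (i = 0; i < n; i++)
		if (c[i].q == Q_TRANSMITTED && c[i].transport == t) {
			c[i].q = Q_RETRANSMIT;
			moved++;
		}
	return moved;
}
```

Triggering the actual transmission after a timeout is left out; the caller would do that once example_rtx_timeout() returns nonzero.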
328
329
330DONE: Cases for transport selection:
DONE: 1) User is idiot savant, picks path
332DONE: 2) Transmit on primary path
333DONE: 3) Retransmit on secondary path
334
335sctp_add_transport() does not check to see if the transport we are
336adding already exists. This COULD lead to having to fail the same
337transport address twice (or more...). A valid INIT packet will not
338list the same address twice (in which case the OTHER guy is screwing
339himself) and we haven't implemented add_ip.
340
341THE PLAN (for adding lost packet handling):
342DONE: Initialize the timer for each transport when the transport is created.
343Generate timer control events according to 6.3.2.
344Write the state function for 6.3.3.
345Write the timer side-effects function.
346
347 Here are random things we would put in an SCTP_packet:
348
349 SCTP header contents:
350 sh->source = htons(ep->port);
351 sh->destination = htons(asoc->c.peerInfo.port);
352 sh->verificationTag = htonl(asoc->c.peerInfo.init.initiateTag);
	A list of chunks
354 The total size of the chunks (incl padding)
355
356 Here are random things we would do to an SCTP_packet:
357
358 sctp_chunk_fits_in_packet(packet, chunk, transport)
359 sctp_append_chunk(packet, chunk)
360 sctp_transmit_packet(packet, transport)
361 INIT_PACKET(asoc, &packet)
362
363
364
365
/* Try to send a chunk down to the network. */
int
sctp_commit_chunk_to_network(struct SCTP_packet *payload,
			     struct SCTP_chunk *chunk,
			     struct SCTP_transport *transport)
{
	int transmitted;

	transmitted = sctp_append_chunk(payload, chunk, transport);
	switch (transmitted) {
	case SCTP_XMIT_PACKET_FULL:
	case SCTP_XMIT_RWND_FULL:
		/* Flush the current packet, then retry on a fresh one. */
		sctp_transmit_packet(...);
		INIT_PACKET(payload);
		transmitted = sctp_append_chunk(payload, chunk, transport);
		break;
	default:
		break;	/* Default is to do nothing. */
	}
	return transmitted;
}
386
387sctp_append_chunk can fail with either SCTP_XMIT_RWND_FULL,
388SCTP_XMIT_MUST_FRAG (PMTU_FULL), or SCTP_XMIT_PACKET_FULL.
389
390
391/* This is how we handle the rtx_timeout single-packet-transmit. */
392 if (pushdown_chunk(payload, chunk, transport)
393 && rtx_timeout) {
394 return(error);
395 }
396
397
398Thu Apr 5 16:04:09 CDT 2001
399Our objective here is to replace the switch in inet_create() with a
400table with register/unregister methods.
401
402#define PROTOSW_PREV
403#define PROTOSW_NEXT
404
405struct inet_protosw inetsw[] = {
406 {list: {next: PROTOSW_NEXT,
407 prev: PROTOSW_PREV,
408 },
409 type: SOCK_STREAM,
410 protocol: IPPROTO_TCP,
411 prot4: &tcp_prot,
412 prot6: &tcpv6_prot,
413 ops4: &inet_stream_ops,
414 ops6: &inet6_stream_ops,
415
416 no_check: 0,
417 reuse: 0,
418 capability: -1,
419 },
420
421#if defined(CONFIG_IP_SCTP) || defined(CONFIG_IP_SCTP_MODULE)
422 {type: SOCK_SEQPACKET,
423 protocol: IPPROTO_SCTP,
424 prot4: &sctp_prot,
425 prot6: &sctpv6_prot,
426 ops4: &inet_seqpacket_ops,
427 ops6: &inet6_seqpacket_ops,
428
429 no_check: 0,
430 reuse: 0,
431 capability: -1,
432 },
433
434 {type: SOCK_STREAM,
435 protocol: IPPROTO_SCTP,
436 prot4: &sctp_conn_prot,
437 prot6: &sctpv6_conn_prot,
438 ops4: &inet_stream_ops,
439 ops6: &inet6_stream_ops,
440
441 no_check: 0,
442 reuse: 0,
443 capability: -1,
444 },
445#endif /* CONFIG_IP_SCTP || CONFIG_IP_SCTP_MODULE */
446
447 {type: SOCK_DGRAM,
448 protocol: IPPROTO_UDP,
449 prot4: &udp_prot,
450 prot6: &udpv6_prot,
451 ops4: &inet_dgram_ops,
452 ops6: &inet6_dgram_ops,
453
454 no_check: UDP_CSUM_DEFAULT,
455 reuse: 0,
456 capability: -1,
457 },
458
459
460 {type: SOCK_RAW,
461 protocol: IPPROTO_WILD, /* wildcard */
462 prot4: &raw_prot,
463 prot6: &rawv6_prot,
464 ops4: &inet_dgram_ops,
465 ops6: &inet6_dgram_ops,
466
467 no_check: UDP_CSUM_DEFAULT,
468 reuse: 1,
469 capability: CAP_NET_RAW,
470 },
471
472}; /* struct inet_protosw inetsw */
473
474Here are things that need to go in that table:
475
476The first two fields are the keys for the table.
477struct inet_protosw {
478 struct list_head list;
479 unsigned short type;
480 int protocol; /* This is the L4 protocol number. */
481 struct proto *prot;
482 struct proto_ops *ops;
483
484 char no_check;
485 unsigned char reuse;
486 int capability;
487};
488
489Set type to SOCK_WILD to represent a wildcard.
490Set protocol to IPPROTO_WILD to represent a wildcard.
491Set no_check to 0 if we want all checksums.
492Set reuse to 0 if we do not want to set sk->reuse.
493Set 'capability' to -1 if no special capability is needed.
494
495
496* protocol = IPPROTO_TCP; /* Layer 4 proto number */
497* prot = &tcp_prot; /* Switch table for this proto */
498* sock->ops = &inet_stream_ops; /* Switch tbl for this type */
499
500 sk->num = protocol;
501- sk->no_check = UDP_CSUM_DEFAULT;
502- sk->reuse = 1;
503
504
505
506 if (type == SOCK_RAW && protocol == IPPROTO_RAW)
507 sk->protinfo.af_inet.hdrincl = 1;
508
509
510if (SOCK_RAW == sock->type) {
511 if (!capable(CAP_NET_RAW))
512 goto free_and_badperm;
513 if (!protocol)
514 goto free_and_noproto;
515 prot = &raw_prot;
516 sk->reuse = 1;
517 sk->num = protocol;
518 sock->ops = &inet_dgram_ops;
519 if (protocol == IPPROTO_RAW)
520 sk->protinfo.af_inet.hdrincl = 1;
521} else {
522 lookup();
523}
524
525
526Supporting routines:
527int inet_protosw_register(struct inet_protosw *p);
528int inet_protosw_unregister(struct inet_protosw *p);
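
A rough sketch of how the register/unregister pair and the wildcard lookup might behave. Everything here is an assumption for illustration: a fixed array stands in for the list_head, a name string stands in for the prot/ops pointers, and the wildcard values are simply chosen as 0.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define EXAMPLE_SOCK_WILD	0	/* assumed wildcard value */
#define EXAMPLE_IPPROTO_WILD	0	/* assumed wildcard value */
#define EXAMPLE_MAX_PROTOSW	8

struct example_protosw {
	unsigned short type;	/* key 1: socket type */
	int protocol;		/* key 2: L4 protocol number */
	const char *name;	/* stands in for prot/ops */
};

static struct example_protosw *example_inetsw[EXAMPLE_MAX_PROTOSW];

static struct example_protosw example_tcp =
	{ SOCK_STREAM, IPPROTO_TCP, "tcp" };
static struct example_protosw example_raw =
	{ SOCK_RAW, EXAMPLE_IPPROTO_WILD, "raw" };

/* Put p in the first free slot. Returns 0, or -1 if the table is full. */
static int example_protosw_register(struct example_protosw *p)
{
	size_t i;

	for (i = 0; i < EXAMPLE_MAX_PROTOSW; i++) {
		if (example_inetsw[i] == NULL) {
			example_inetsw[i] = p;
			return 0;
		}
	}
	return -1;
}

/* Remove p. Returns 0, or -1 if p was not registered. */
static int example_protosw_unregister(struct example_protosw *p)
{
	size_t i;

	for (i = 0; i < EXAMPLE_MAX_PROTOSW; i++) {
		if (example_inetsw[i] == p) {
			example_inetsw[i] = NULL;
			return 0;
		}
	}
	return -1;
}

/* First entry whose keys match; a wildcard key in the entry matches
 * anything. Returns NULL if nothing matches. */
static struct example_protosw *example_protosw_lookup(unsigned short type,
						       int protocol)
{
	size_t i;

	for (i = 0; i < EXAMPLE_MAX_PROTOSW; i++) {
		struct example_protosw *p = example_inetsw[i];

		if (p == NULL)
			continue;
		if (p->type != type && p->type != EXAMPLE_SOCK_WILD)
			continue;
		if (p->protocol != protocol &&
		    p->protocol != EXAMPLE_IPPROTO_WILD)
			continue;
		return p;
	}
	return NULL;
}
```

With both entries registered, a lookup for (SOCK_RAW, IPPROTO_ICMP) lands on the raw entry through its wildcard protocol, which is the case the SOCK_RAW branch of the old switch handled by hand.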
529
530
531Tue Apr 10 12:57:45 CDT 2001
532Question: Should SCTP_packet be a dependent subclass of
533SCTP_outqueue, or should SCTP_outqueue and SCTP_packet be independent
534smart pipes which we can glue together?
535
536Answer: We feel that the independent smart pipes make independent
537testing easier.
538
539
540Sat Apr 21 18:17:06 CDT 2001
541OK, here's what's going on. An INIT and an INIT ACK contain almost
542exactly the same parameters, except that an INIT ACK must contain a
543cookie (the one that the initiator needs to echo). In OUR
544implementation, we put the INIT packet in the cookie, so we really do
545most of the processing on the INIT when we get the COOKIE ECHO.
546
547Dilemma:
	When do we convert the INIT to host byte order? We want to
549 When do we convert the INIT to host byte forder? We want to
550 use the same code for all three cases: INIT, INIT ACK, COOKIE
551 ECHO. But if we convert for INIT, then the INIT packet in the
552 cookie (which is processed with the COOKIE ECHO) will be in
553 host byte order.
554
555Options:
556 1. Leave the INIT in network byte order. All access must convert
557 to host byte order as needed. Blech. This violates our
558 existing conventions. Hmm. As long as we don't walk the
559 parameters again, we might be OK...
560
561 2. Add an argument to sctp_process_param() telling whether or
562 not to convert the parameter.
563
564We chose option 1.
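
A sketch of what option 1 looks like in practice: the parameters stay in network byte order in the buffer and each access converts as needed. The struct and function names are hypothetical; this is not sctp_process_param().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical parameter header; both fields are in network order. */
struct example_param_hdr {
	uint16_t type;
	uint16_t length;	/* header plus value, excluding padding */
};

/*
 * Count the parameters in a TLV area that is left in network byte
 * order. Nothing is rewritten, so the same bytes can be walked again
 * later (e.g. when they come back inside a cookie).
 * Returns the count, or -1 if the area is malformed.
 */
static int example_count_params(const uint8_t *p, size_t len)
{
	int n = 0;

	while (len > 0) {
		struct example_param_hdr h;
		size_t plen, step;

		if (len < sizeof(h))
			return -1;
		memcpy(&h, p, sizeof(h));	/* avoid unaligned access */
		plen = ntohs(h.length);		/* convert on access only */
		if (plen < sizeof(h) || plen > len)
			return -1;
		step = (plen + 3) & ~(size_t)3;	/* parameters are 4-byte padded */
		if (step > len)
			step = len;		/* tolerate missing final padding */
		p += step;
		len -= step;
		n++;
	}
	return n;
}
```

Because the walk never modifies the buffer, the INIT copied into the cookie is byte-for-byte what arrived on the wire.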
565
566We REALLY should unify sctp_make_init() and sctp_make_init_ack(). The
567only difference is the cookie in the INIT ACK.
568
569We might one day need a version of sctp_addto_chunk() called
570sctp_addto_param() which does NOT add extra padding.
571
572How can we get the initial TSN in sctp_unpack_cookie without first
573having processed the INIT packet buried in the cookie?
574
575Sat Apr 28 15:03:48 CDT 2001
576This MIGHT be a bug--look for places we use sizeof(struct iphdr)--
577possibly we might need to grub around in the sk_buff structure
578to find the TRUE length of the iphdr (including options).
579One of the places is where we initialize a struct SCTP_packet--we
580really need to know how big the ip header options are.
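
For the record, the true IPv4 header length comes from the IHL field, not from sizeof(struct iphdr). A small sketch on a raw byte buffer (the function name is made up, and this deliberately avoids sk_buff):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * True IPv4 header length in bytes, options included. The low nibble
 * of the first header byte is IHL, counted in 32-bit words.
 * Returns 0 if the byte does not look like a valid IPv4 header start.
 */
static size_t example_ipv4_hdr_len(const uint8_t *pkt)
{
	size_t ihl = pkt[0] & 0x0f;

	if ((pkt[0] >> 4) != 4 || ihl < 5)
		return 0;
	return ihl * 4;
}
```

A header with no options gives 20, which is what sizeof(struct iphdr) assumes; anything with options is larger, up to 60 bytes.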
581
582I've walked all the way through to the point where we pass INIT_ACK
583down to IP--it looks OK. We DO parse the parameters correctly...
584
585Two bugs--bind loop1a not loop1 in the second bind, and
586sctp_bind_endpoint() should not let you bind the same address twice.
There should be an appropriate errno in the bind man page. EINVAL.
588
589
590Tue May 15 15:35:28 CDT 2001
591compaq3_paddedinitackOK.tcp
592 We ignore ABORT.
593datakinectics_2
594 We will send extra data before we get a COOKIE ACK...
595 We really lucked out and this implementation ran fine...
596sun (lost trace)
597 We have an INIT that causes an oops.
598telesoft2_lostsendings.tcp
599telesoft3_spicyinitack.tcp
600 This INIT ACK causes an oops.
601datakinectics_3
602ulticom_3
603 They transmitted GAP reports and we retransmitted a TSN which had
604 been gap ack'd.
605adax2_goodsend.tcp
606 We produce MANY SACK's in a row after delaying way too long.
607 The retransmissions did not get bundled.
608
609Mon May 21 17:06:56 CDT 2001
610sctp_make_abort() needs to build an SCTP packet, not just a chunk...
611How do we handle cause codes?
612
613I don't know, but here's some random lines pruned from
614sctp_init_packet...
615
616 packet->source_port = asoc->ep->port;
617 packet->destination_port = asoc->peer.port;
618 packet->verificationTag = asoc->peer.i.initiateTag;
619
620
621
622CHANGES NEEDED IN THE LINUX KERNEL to support SCTP:
623* - sockreg
624- both saddr and daddr need to be explicit arguments to the function
625 which takes packets for transmission--move these OUT of the
626 socket... Decouple d_addr from struct sock
627- bindx()
628- glue (elt in sk->tp_pinfo, etc...)
629
630We THINK we have the following items:
631- Per packet frag control (v6)
632- Unified PMTU discovery
633- iov-like sk_buff (to minimize copies)
634
635Fri Aug 17 10:58:35 CDT 2001
636Current thinking:
637
638INADDR_ANY semantics for TCP imply an abstraction of the IP
interfaces--use any that exist, TCP couldn't care less. This means if
640you add or delete interfaces at a lower level, this doesn't require
641more configuration for TCP.
642
643What this means for SCTP is that INADDR_ANY should also abstract the
644IP interfaces, so that when an association is initiated, we use all
645available IP interfaces, even if some have been added or deleted since
646boot.
647
648At bind, we grub for all interfaces and add them to the endpoint.
649After bind, if an interface is added...we know about it because
650 a) a connection came in on it and we're bound to INADDR_ANY--we add
651 the new transport to the list and use that for the association.
652 b) we initiate and...regrub for all existing interfaces?
653 c) hooks may exist to inform us when new IP interfaces rise
654 phoenix-like from the void (not pointer).
655
656Fri Aug 17 18:24:01 CDT 2001
657
658We need to look in ip6_input.c for IPPROTO_TCP and IPPROTO_UDP. This
659probably needs to use the registration table to do some
660comparisons...
661
662There are several functions in tcp_ipv6.c that we want for sctp. They
663are currently static; we want them to be exported.
664
665Tue Aug 21 13:09:09 CDT 2001
666
667This is a revised list of changes we need in the 2.4.x kernels to
668support SCTP. These are based in part on Bidulock's code:
669
670MUST HAVE:
671+ inet_listen() hook for transport layer
672+ Make tests for SOCK_STREAM more specific (add IPPROTO_TCP checks)
673? Look for references to IPPROTO_TCP and IPPROTO_UDP to see if they
674 are sufficiently uniform.
675
676REALLY OUGHT TO HAVE:
677- bindx() (Daisy)
678- sockreg (done, need to use)
679- netfilter
680
681Interface
682
683+ inet_getname() hook for transport layer?
684 - small & simple hooks here.
685
686+ The ability to append things to proc files (/proc/sys/net
687 specifically...)
688
689TCP-one-true-transport cruft
690- ip_setsockopt() hook (See SOCK_STREAM below.)
691- unified PMTU discovery (allegedly done, need to use)
692(See tcp_sync_mss)
693 SOLUTIONS:
694 - We could move the extension headers and PMTU stuff out to the socket.
695 - We could intercept this socket call in sctp_setsockopt, and do
696 the relevant fix up there. (LY characterizes as "flippin'
697 disgusting")
698 - We could use dst->pmtu (after all, TCP does...sort of...)
699
700Performance
701- decouple d_addr from struct sock (Andi Kleen)
702- zero-copy (done, need to use)
703- per packet IPv6 fragmentation control (allegedly done, need to use)
704 - Why did LY ask for this--he doesn't recall...
705
706---------------------------------------------------------------------------
707Tue Feb 10 11:26:26 PST 2004 La Monte
708
709One significant policy change which 1.0.0 should include is a bias toward
710performance issues.
711
712One principle I want to make sure survives performance improvements is
713readability. In particular, I still would like to put together a site
714hyperlinking LKSCTP with RFC2960 and supporting docs. It should be
715possible to ask "What code implements THIS section?" and "What mandated
716THIS piece of code?"
717
718Consequently, a performance enhancement should either improve readability
719or define a separate clearly marked fast-path. In particular, that class
720of speedups which collapses multiple decisions from different sections of
721the RFCs should probably use separate fast-path code.
722
723Separate fast-path code creates a maintenance problem, so fast-path code
724REALLY needs comments which point explicitly to the slow path. The slow-
725path code should where possible point to the corresponding fast path. It
726then becomes easier to check whether fixes for one path are relevant for
727the other as well.
728
729