Commit Graph

73 Commits

Author SHA1 Message Date
testscript
ec63f38ba2 fix sendpacket race condition 2023-02-21 20:51:28 +08:00
Kusakabe Si
326d716e80 Fix bug: p2p mode querypeer not work 2021-12-13 04:20:58 +00:00
Kusakabe Si
0258ed4ef0 Update readme.md 2021-12-12 18:58:33 +00:00
Kusakabe Si
3f6001dcdf Move TTL to header, remove packet_len, increase MTU to 1404 2021-12-12 11:35:20 +00:00
Kusakabe Si
ad4ffff985 Update default MTU to 1402 2021-12-12 08:38:13 +00:00
Kusakabe Si
32b674d629 fix invalid packet len detection 2021-12-10 19:44:04 +00:00
Kusakabe Si
5608430139 Check packetsize before process to fix out of range 2021-12-10 18:07:09 +00:00
Kusakabe Si
db8cf4710a reorder config, 3.3, genp2p 2021-12-10 00:09:50 +00:00
Kusakabe Si
17fe0cdae3 fastgen static config, update readme 2021-12-09 20:29:58 +00:00
Kusakabe Si
d555963227 fast spread, static gencfg 2021-12-09 07:46:15 +00:00
Kusakabe Si
c1133c9a69 http based pong_msg 2021-12-03 20:16:18 +00:00
KusakabeSi
f689aadccf new log type: internal 2021-10-09 11:08:28 +00:00
KusakabeSi
776df87879 bugfix: addpeer API. new: tcpsock,udpsock,fd 2021-10-06 08:47:57 +00:00
KusakabeSi
b99b0254ea L2FIB Timeout 2021-10-01 09:01:41 +00:00
KusakabeSi
06b7ea1edb Static mode for supernode,dump packet, update readme, add code of concept 2021-09-30 21:44:07 +00:00
KusakabeSi
a26376cec5 add/del peer dynamically 2021-09-21 20:03:11 +00:00
KusakabeSi
29b53884ba shrink header to inc MTU to 1416 2021-09-21 01:31:11 +00:00
KusakabeSi
72e4ebc91d version check in supernode 2021-09-20 21:00:34 +00:00
KusakabeSi
7951ba2f6a fix supernode ipv6 problem, add error message 2021-09-20 16:27:53 +00:00
KusakabeSi
f8fe962f6a NTP support 2021-08-25 18:20:33 +00:00
KusakabeSi
eae0dc1aa5 Fix bug: remove unnecessary nhTable update 2021-08-25 18:20:30 +00:00
KusakabeSi
89f3069e7f New log option: LogNormal 2021-08-25 18:20:29 +00:00
KusakabeSi
d717d35f64 LinuxTap, not test yet 2021-08-24 12:37:37 +00:00
KusakabeSi
8a1126dcbf Bugfix, p2p mode ok 2021-08-23 16:39:04 +00:00
KusakabeSi
88ef721c1d Bugfix, static mode ok 2021-08-23 08:35:17 +00:00
KusakabeSi
19fe84cf0c bunch of bugfix, supermode OK 2021-08-22 14:19:35 +00:00
KusakabeSi
4939f9f0c4 Not test yet 2021-08-21 14:23:27 +00:00
KusakabeSi
87a62f873b TAP and routeing 2021-08-16 19:37:15 +00:00
KusakabeSi
7801b56b41 rename module 2021-08-16 19:36:46 +00:00
Jason A. Donenfeld
841756e328 device: simplify allowedips lookup signature
The inliner should handle this for us.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-06-03 16:29:43 +02:00
Jason A. Donenfeld
7121927b87 device: add ID to repeated routines
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-07 12:21:21 +02:00
Jason A. Donenfeld
efb8818550 device: avoid verbose log line during ordinary shutdown sequence
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-07 09:39:06 +02:00
Josh Bleecher Snyder
60a26371f4 device: log all errors received by RoutineReceiveIncoming
When debugging, it's useful to know why a receive func exited.

We were already logging that, but only in the "death spiral" case.
Move the logging up, to capture it always.
Reduce the verbosity, since it is not an error case any more.
Put the receive func name in the log line.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-06 11:22:13 +02:00
Jason A. Donenfeld
54dbe2471f conn: reconstruct v4 vs v6 receive function based on symtab
This is kind of gross but it's better than the alternatives.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-12 15:35:32 -06:00
Kristupas Antanavičius
d2fd0c0cc0 device: allocate new buffer in receive death spiral
Note: this bug is "hidden" by avoiding "death spiral" code path by
6228659 ("device: handle broader range of errors in RoutineReceiveIncoming").

If the code reached "death spiral" mechanism, there would be multiple
double frees happening. This results in a deadlock on iOS, because the
pools are fixed size and goroutine might stop until somebody makes
space in the pool.

This was almost 100% repro on the new ARM Macbooks:

- Build with 'ios' tag for Mac. This will enable bounded pools.
- Somehow call device.IpcSet at least couple of times (update config)
- device.BindUpdate() would be triggered
- RoutineReceiveIncoming would enter "death spiral".
- RoutineReceiveIncoming would stall on double free (pool is already
  full)
- The stuck routine would deadlock 'device.closeBindLocked()' function
  on line 'netc.stopping.Wait()'

Signed-off-by: Kristupas Antanavičius <kristupas.antanavicius@nordsec.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-12 11:14:53 -06:00
Josh Bleecher Snyder
10533c3e73 all: make conn.Bind.Open return a slice of receive functions
Instead of hard-coding exactly two sources from which
to receive packets (an IPv4 source and an IPv6 source),
allow the conn.Bind to specify a set of sources.

Beneficial consequences:

* If there's no IPv6 support on a system,
  conn.Bind.Open can choose not to return a receive function for it,
  which is simpler than tracking that state in the bind.
  This simplification removes existing data races from both
  conn.StdNetBind and bindtest.ChannelBind.
* If there are more than two sources on a system,
  the conn.Bind no longer needs to add a separate muxing layer.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-04-02 11:07:08 -06:00
Josh Bleecher Snyder
6228659a91 device: handle broader range of errors in RoutineReceiveIncoming
RoutineReceiveIncoming exits immediately on net.ErrClosed,
but not on other errors. However, for errors that are known
to be permanent, such as syscall.EAFNOSUPPORT,
we may as well exit immediately instead of retrying.

This considerably speeds up the package device tests right now,
because the Bind sometimes (incorrectly) returns syscall.EAFNOSUPPORT
instead of net.ErrClosed.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-03-30 12:41:43 -07:00
Jason A. Donenfeld
4e439ea10e conn: bump to 1.16 and get rid of NetErrClosed hack
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-16 21:05:25 +01:00
Jason A. Donenfeld
da32fe328b device: handshake routine writes into encryption queue
Since RoutineHandshake calls peer.SendKeepalive(), it potentially is a
writer into the encryption queue, so we need to bump the wg count.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-09 19:26:45 +01:00
Jason A. Donenfeld
6ac1240821 device: do not attach finalizer to non-returned object
Before, the code attached a finalizer to an object that wasn't returned,
resulting in immediate garbage collection. Instead return the actual
pointer.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-09 15:37:04 +01:00
Josh Bleecher Snyder
d8dd1f254f device: remove mutex from Peer send/receive
The immediate motivation for this change is an observed deadlock.

1. A goroutine calls peer.Stop. That calls peer.queue.Lock().
2. Another goroutine is in RoutineSequentialReceiver.
   It receives an elem from peer.queue.inbound.
3. The peer.Stop goroutine calls close(peer.queue.inbound),
   close(peer.queue.outbound), and peer.stopping.Wait().
   It blocks waiting for RoutineSequentialReceiver
   and RoutineSequentialSender to exit.
4. The RoutineSequentialReceiver goroutine calls peer.SendStagedPackets().
   SendStagedPackets attempts peer.queue.RLock().
   That blocks forever because the peer.Stop
   goroutine holds a write lock on that mutex.

A background motivation for this change is that it can be expensive
to have a mutex in the hot code path of RoutineSequential*.

The mutex was necessary to avoid attempting to send elems on a closed channel.
This commit removes that danger by never closing the channel.
Instead, we send a sentinel nil value on the channel to indicate
to the receiver that it should exit.

The only problem with this is that if the receiver exits,
we could write an elem into the channel which would never get received.
If it never gets received, it cannot get returned to the device pools.

To work around this, we use a finalizer. When the channel can be GC'd,
the finalizer drains any remaining elements from the channel and
restores them to the device pool.

After that change, peer.queue.RWMutex no longer makes sense where it is.
It is only used to prevent concurrent calls to Start and Stop.
Move it to a more sensible location and make it a plain sync.Mutex.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-02-08 13:02:52 -08:00
Josh Bleecher Snyder
0bcb822e5b device: overhaul device state management
This commit simplifies device state management.
It creates a single unified state variable and documents its semantics.

It also makes state changes more atomic.
As an example of the sort of bug that occurred due to non-atomic state changes,
the following sequence of events used to occur approximately every 2.5 million test runs:

* RoutineTUNEventReader received an EventDown event.
* It called device.Down, which called device.setUpDown.
* That set device.state.changing, but did not yet attempt to lock device.state.Mutex.
* Test completion called device.Close.
* device.Close locked device.state.Mutex.
* device.Close blocked on a call to device.state.stopping.Wait.
* device.setUpDown then attempted to lock device.state.Mutex and blocked.

Deadlock results. setUpDown cannot progress because device.state.Mutex is locked.
Until setUpDown returns, RoutineTUNEventReader cannot call device.state.stopping.Done.
Until device.state.stopping.Done gets called, device.state.stopping.Wait is blocked.
As long as device.state.stopping.Wait is blocked, device.state.Mutex cannot be unlocked.
This commit fixes that deadlock by holding device.state.mu
when checking that the device is not closed.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-02-08 10:32:07 -08:00
Josh Bleecher Snyder
9c75f58f3d device: remove device.state.stopping from RoutineHandshake
It is no longer necessary.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-02-08 08:18:32 -08:00
Josh Bleecher Snyder
84a42aed63 device: remove device.state.stopping from RoutineDecryption
It is no longer necessary, as of 454de6f3e64abd2a7bf9201579cd92eea5280996
(device: use channel close to shut down and drain decryption channel).

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-02-08 08:18:32 -08:00
Jason A. Donenfeld
beb25cc4fd device: use new model queues for handshakes
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-01-29 18:24:45 +01:00
Jason A. Donenfeld
9263014ed3 device: simplify peer queue locking
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-01-29 16:21:53 +01:00
Jason A. Donenfeld
d4112d9096 global: bump copyright
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-01-28 17:52:15 +01:00
Jason A. Donenfeld
1b092ce584 device: get rid of nonce routine
This moves to a simple queue with no routine processing it, to reduce
scheduler pressure.

This splits latency in half!

benchmark                  old ns/op     new ns/op     delta
BenchmarkThroughput-16     2394          2364          -1.25%
BenchmarkLatency-16        259652        120810        -53.47%

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-01-27 18:38:27 +01:00
Jason A. Donenfeld
d669c78c43 device: combine debug and info log levels into 'verbose'
There are very few cases, if any, in which a user only wants one of
these levels, so combine it into a single level.

While we're at it, reduce indirection on the loggers by using an empty
function rather than a nil function pointer. It's not like we have
retpolines anyway, and we were always calling through a function with a
branch prior, so this seems like a net gain.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-01-26 23:05:48 +01:00
Josh Bleecher Snyder
7139279cd0 device: change logging interface to use functions
This commit overhauls wireguard-go's logging.

The primary, motivating change is to use a function instead
of a *log.Logger as the basic unit of logging.
Using functions provides a lot more flexibility for
people to bring their own logging system.

It also introduces logging helper methods on Device.
These reduce line noise at the call site.
They also allow for log functions to be nil;
when nil, instead of generating a log line and throwing it away,
we don't bother generating it at all.
This spares allocation and pointless work.

This is a breaking change, although the fix required
of clients is fairly straightforward.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-01-26 22:40:20 +01:00