This pull request modifies the IdP and cache manager(s) to prevent the sending of app metadata
to the upstream IDP on self-hosted instances.
As a result, the IdP will now load all users from the IdP without filtering based on accountID.
We disable user invites as the administrator's own IDP system manages them.
If there is a difference between local and cached data, we trigger a cache refresh;
as we remove users from the local store and potentially from the remote IDP,
we need to switch the source of truth to the local store to prevent unwanted endless
cache for cases where the removal from the IDP fails or for cases
where the userDeleteFromIDPEnabled got enabled after the first user deletion.
This commit enhances the functionality of the network routes endpoint by introducing a new parameter called `peers_group`. This addition allows users to associate network routes with specific peer groups, simplifying the management and distribution of routes within a network.
Extend the deleted user info with the username
- Because initially, we did not store the user name in the activity db
Sometimes, we can not provide the user name in the API response.
Fix service user deletion
- In case of service user deletion, do not invoke the IdP delete function
- Prevent self deletion
Add a direct write to handle management.json write operation.
Remove empty configuration types to avoid unnecessary fields in the generated management.json file.
Implement user deletion across all IDP-ss. Expires all user peers
when the user is deleted. Users are permanently removed from a local
store, but in IDP, we remove Netbird attributes for the user
untilUserDeleteFromIDPEnabled setting is not enabled.
To test, an admin user should remove any additional users.
Until the UI incorporates this feature, use a curl DELETE request
targeting the /users/<USER_ID> management endpoint. Note that this
request only removes user attributes and doesn't trigger a delete
from the IDP.
To enable user removal from the IdP, set UserDeleteFromIDPEnabled
to true in account settings. Until we have a UI for this, make this
change directly in the store file.
Store the deleted email addresses in encrypted in activity store.
This PR showcases the implementation of additional linter rules. I've updated the golangci-lint GitHub Actions to the latest available version. This update makes sure that the tool works the same way locally - assuming being updated regularly - and with the GitHub Actions.
I've also taken care of keeping all the GitHub Actions up to date, which helps our code stay current. But there's one part, goreleaser that's a bit tricky to test on our computers. So, it's important to take a close look at that.
To make it easier to understand what I've done, I've made separate changes for each thing that the new linters found. This should help the people reviewing the changes see what's going on more clearly. Some of the changes might not be obvious at first glance.
Things to consider for the future
CI runs on Ubuntu so the static analysis only happens for Linux. Consider running it for the rest: Darwin, Windows
The ephemeral manager keep the inactive ephemeral peers in a linked list. The manager schedule a cleanup procedure to the head of the linked list (to the most deprecated peer). At the end of cleanup schedule the next cleanup to the new head.
If a device connect back to the server the manager will remote it from the peers list.
The API authentication with PATs was not considering different userIDClaim
that some of the IdPs are using.
In this PR we read the userIDClaim from the config file
instead of using the fixed default and only keep
it as a fallback if none in defined.
With this fix, all nested slices and pointers will be copied by value.
Also, this fixes tests to compare the original and copy account by their
values by marshaling them to JSON strings.
Before that, they were copying the pointers that also passed the simple `=` compassion
(as the addresses match).
For better auditing this PR adds a dashboard login event to the management service.
For that the user object was extended with a field for last login that is not actively saved to the database but kept in memory until next write. The information about the last login can be extracted from the JWT claims nb_last_login. This timestamp will be stored and compared on each API request. If the value changes we generate an event to inform about a login.
For peer propagation this commit triggers
network map update in two cases:
1) peer login
2) user AutoGroups update
Also it issues new activity message about new user group
for peer login process.
Previous implementation only adds JWT groups to user. This fix also
removes JWT groups from user auto assign groups.
Pelase note, it also happen when user works with dashboard.
Enhancements to Peer Group Assignment:
1. Auto-assigned groups are now applied to all peers every time a user logs into the network.
2. Feature activation is available in the account settings.
3. API modifications included to support these changes for account settings updates.
4. If propagation is enabled, updates to a user's auto-assigned groups are immediately reflected across all user peers.
5. With the JWT group sync feature active, auto-assigned groups are forcefully updated whenever a peer logs in using user credentials.
Enhance the user experience by enabling authentication to Netbird using Single Sign-On (SSO) with any Identity Provider (IDP) provider. Current client offers this capability through the Device Authorization Flow, however, is not widely supported by many IDPs, and even some that do support it do not provide a complete verification URL.
To address these challenges, this pull request enable Authorization Code Flow with Proof Key for Code Exchange (PKCE) for client logins, which is a more widely adopted and secure approach to facilitate SSO with various IDP providers.
This fixes the test logic creates copy of account with empty id and
re-pointing the indices to it.
Also, adds additional check for empty ID in SaveAccount method of FileStore.
* Optimize rules with All groups
* Use IP sets in ACLs (nftables implementation)
* Fix squash rule when we receive optimized rules list from management
* Check links of groups before delete it
* Add delete group handler test
* Rename dns error msg
* Add delete group test
* Remove rule check
The policy cover this scenario
* Fix test
* Check disabled management grps
* Change error message
* Add new activity for group delete event
* Extend protocol and firewall manager to handle old management
* Send correct empty firewall rules list when delete peer
* Add extra tests for firewall manager and uspfilter
* Work with inconsistent state
* Review note
* Update comment
Add new feature to notify the user when new client route has arrived.
Refactor the initial route handling. I move every route logic into the route
manager package.
* Add notification management for client rules
* Export the route notification for Android
* Compare the notification based on network range instead of id.
* Avoid storing account if no peer meta or expiration change
* remove extra log
* Update management/server/peer.go
Co-authored-by: Misha Bragin <bangvalo@gmail.com>
* Clarify why we need to skip account update
---------
Co-authored-by: Misha Bragin <bangvalo@gmail.com>
The new functionality allows blocking a user in the Management service.
Blocked users lose access to the Dashboard, aren't able to modify the network map,
and all of their connected devices disconnect and are set to the "login expired" state.
Technically all above was achieved with the updated PUT /api/users endpoint,
that was extended with the is_blocked field.
Refactored updateServerStates and calculateState
added some checks to ensure we are not sending connecting on context canceled
removed some state updates from the RunClient function
Some IDP requires different scope requests and
issue access tokens for different purposes
This change allow for remote configurable scopes
and the use of ID token
Some IDP use different audience for different clients.
This update checks HTTP and Device authorization flow audience values.
---------
Co-authored-by: Givi Khojanashvili <gigovich@gmail.com>
Default Rego policy generated from the rules in some cases is broken.
This change fixes the Rego template for rules to generate policies.
Also, file store load constantly regenerates policy objects from rules.
It allows updating/fixing of the default Rego template during releases.
Check SSO support by calling the internal.GetDeviceAuthorizationFlowInfo
Rename LoginSaveConfigIfSSOSupported to SaveConfigIfSSOSupported
Receive device name as input for setup-key login
have a default android name when no context value is provided
log non parsed errors from management registration calls
Fix the status indication in the client service. The status of the
management server and the signal server was incorrect if the network
connection was broken. Basically the status update was not used by
the management and signal library.
Rego policy migration clears the rules property of the file storage, but it does not allow rollback management upgrade, so this changes pre-saves rules in the file store and updates it from the policies.
When peer login expires, all remote peers are updated to exclude the peer from connecting.
Once a peer re-authenticates, the remote peers are not updated.
This peer fixes the behavior.
The peer login expiration ACL check introduced in #714
filters out peers that are expired and agents receive a network map
without that expired peers.
However, the agents should see those peers in status "Disconnected".
This PR extends the Agent <-> Management protocol
by introducing a new field OfflinePeers
that contain expired peers. Agents keep track of those and display
then just in the Status response.
The Management gRPC API has too much business logic
happening while it has to be in the Account manager.
This also needs to make more requests to the store
through the account manager.
Bug 1: When calculating the network map, peers added by a setup key
were falling under expiration logic while they shouldn't.
Bug 2: Peers HTTP API didn't return expired peers for non-admin users
because of the expired peer check in the ACL logic.
The fix applies peer expiration checks outside of the ACL logic.
When we delete a peer from an account, we save the account in the file store.
The file store maintains peerID -> accountID and peerKey -> accountID indices.
Those can't be updated when we delete a peer because the store saves the whole account
without a peer already and has no access to the removed peer.
In this PR, we dynamically check if there are stale indices when GetAccountByPeerPubKey
and GetAccountByPeerID.
Goals:
Enable peer login expiration when adding new peer
Expire peer's login when the time comes
The account manager triggers peer expiration routine in future if the
following conditions are true:
peer expiration is enabled for the account
there is at least one peer that has expiration enabled and is connected
The time of the next expiration check is based on the nearest peer expiration.
Account manager finds a peer with the oldest last login (auth) timestamp and
calculates the time when it has to run the routine as a sum of the configured
peer login expiration duration and the peer's last login time.
When triggered, the expiration routine checks whether there are expired peers.
The management server closes the update channel of these peers and updates
network map of other peers to exclude expired peers so that the expired peers
are not able to connect anywhere.
The account manager can reschedule or cancel peer expiration in the following cases:
when admin changes account setting (peer expiration enable/disable)
when admin updates the expiration duration of the account
when admin updates peer expiration (enable/disable)
when peer connects (Sync)
P.S. The network map calculation was updated to exclude peers that have login expired.
Extend HTTP API with Account endpoints to configure global peer login expiration.
GET /api/accounts
PUT /api/account/{id}/
The GET endpoint returns an array of accounts with
always one account in the list. No exceptions.
The PUT endpoint updates account settings:
PeerLoginExpiration and PeerLoginExpirationEnabled.
PeerLoginExpiration is a duration in seconds after which peers' logins will expire.
This PR adds a peer login expiration logic that requires
peers created by a user to re-authenticate (re-login) after
a certain threshold of time (24h by default).
The Account object now has a PeerLoginExpiration
property that indicates the duration after which a peer's
login will expire and a login will be required. Defaults to 24h.
There are two new properties added to the Peer object:
LastLogin that indicates the last time peer successfully used
the Login gRPC endpoint and LoginExpirationEnabled that
enables/disables peer login expiration.
The login expiration logic applies only to peers that were created
by a user and not those that were added with a setup key.
This feature allows using the custom claim in the JWT token as a user ID.
Refactor claims extractor with options support
Add is_current to the user API response
Replace Peer.Key as internal identifier with a randomly generated Peer.ID
in the Management service.
Every group now references peers by ID instead of a public key.
Every route now references peers by ID instead of a public key.
FileStore does store.json file migration on startup by generating Peer.ID and replacing
all Peer.Key identifier references .
Adding --external-ip-map and --dns-resolver-address to up command and shorthand option to global flags.
Refactor get and read config functions with new ConfigInput type.
updated cobra package to latest release.
This PR adds system activity tracking.
The management service records events like
add/remove peer, group, rule, route, etc.
The activity events are stored in the SQLite event store
and can be queried by the HTTP API.
Updated tests, API, and account manager methods
Sync routes to peers in the distribution groups
Added store upgrade by adding the All group to routes that don't have them
Add a usage_limit parameter to the API.
This limits the number of times a setup key
can be used.
usage_limit == 0 indicates the the usage is inlimited.
Use stdout and stderr log path only if on Linux and attempt to create the path
Update status system with FQDN fields and
status command to display the domain names of remote and local peers
Set some DNS logs to tracing
update readme file
Due to peer reconnects when restarting the Management service,
there are lots of SaveStore operations to update peer status.
Store.SavePeerStatus stores peer status separately and the
FileStore implementation stores it in memory.
Added DNS update protocol message
Added sync to clients
Update nameserver API with new fields
Added default NS groups
Added new dns-name flag for the management service append to peer DNS label
This PR simplifies Store and FileStore
by keeping just the Get and Save account methods.
The AccountManager operates mostly around
a single account, so it makes sense to fetch
the whole account object from the store.
This PR brings open-telemetry metrics to the
Management service.
The Management service exposes new HTTP endpoint
/metrics on 8081 port by default.
The port can be changed by specifying
--metrics-port PORT flag when starting the service.
This will help us understand usage on self-hosted deployments
The collection may be disabled by using the flag --disable-anonymous-metrics or
NETBIRD_DISABLE_ANONYMOUS_METRICS in setup.env
This PR brings user invites logic to the Management service
via HTTP API.
The POST /users/ API endpoint creates a new user in the Idp
and then in the local storage.
Once the invited user signs ups, the account invitation is redeemed.
There are a few limitations.
This works only with an enabled IdP manager.
Users that already have a registered account can't be invited.
Add DNS package and Nameserver group objects
Add CRUD operations for Nameserver Groups to account manager
Add Routes and Nameservers to Account Copy method
Run docker tests with timeout and serial flags
Support Generic OAuth 2.0 Device Authorization Grant
as per RFC specification https://www.rfc-editor.org/rfc/rfc8628.
The previous version supported only Auth0 as an IDP backend.
This implementation enables the Interactive SSO Login feature
for any IDP compatible with the specification, e.g., Keycloak.
When creating TLSConfig from provided certificate file, the HTTP/2 support is not enabled.
It works with Certmanager because it adds h2 support.
We enable it the same way when creating TLSConfig from files.
This PR is a part of an effort to use standard ports (443 or 80) that are usually allowed by default in most of the environments.
Right now Management Service runs the Let'sEncrypt manager on port 443, HTTP API server on port 33071,
and a gRPC server on port 33073. There are three separate listeners.
This PR combines these listeners into one.
With this change, the HTTP and gRPC server runs on either 443 with TLS or 80 without TLS
by default (no --port specified).
Let's Encrypt manager always runs on port 443 if enabled.
The backward compatibility server runs on port 33073 (with TLS or without).
HTTP port 33071 is obsolete and not used anymore.
Newly installed agents will connect to port 443 by default instead of port 33073 if not specified otherwise.
Right now Signal Service runs the Let'sEncrypt manager on port 80
and a gRPC server on port 10000. There are two separate listeners.
This PR combines these listeners into one with a cmux lib.
The gRPC server runs on either 443 with TLS or 80 without TLS.
Let's Encrypt manager always runs on port 80.
The Management client will try reconnecting in case.
of network issues or non-permanent errors.
If the device was off-boarded, then the client will stop retrying.
* Send netmask from account network
Added the GetPeerNetwork method to account manager
Pass a copy of the network to the toPeerConfig function
to retrieve the netmask from the network instead of constant
updated methods and added test
* check if the network is the same for 2 peers
* Use expect with BeEquivalentTo
This PR adds support for SSH access through the NetBird network
without managing SSH skeys.
NetBird client app has an embedded SSH server (Linux/Mac only)
and a netbird ssh command.
Introduced an OpenAPI specification.
Updated API handlers to use the specification types.
Added patch operation for rules and groups
and methods to the account manager.
HTTP PUT operations require id, fail if not provided.
Use snake_case for HTTP request and response body
There are a few places where an account is created.
When we create a new account, there should be
some defaults set. E.g. created by and group ALL.
It makes sense to add it in one place to avoid inconsistencies.
* fix(acl): update each peer's network when rule,group or peer changed
* fix(ACL): update network test
* fix(acl): cleanup indexes before update them
* fix(acl): clean up rules indexes only for account
Before this change, NetBird Agent wasn't handling
peer interface configuration changes dynamically.
Also, remote peer configuration changes have
not been applied (e.g. AllowedIPs changed).
Not a very common cause, but still it should be handled.
Now, Agent reacts to PeerConfig changes sent from the
management service and restarts remote connections
if AllowedIps have been changed.
The peer IP allocation logic was allocating sequential peer IP from the 100.64.0.0/10
address block.
Each account is created with a random subnet from 100.64.0.0/10.
The total amount of potential subnets is 64.
The new logic allocates random peer IP
from the account subnet.
This gives us flexibility to add support for
multi subnet accounts without overlapping IPs.
Send Desktop UI client version as user-agent to daemon
This is sent on every login request to the management
Parse the GRPC context on the system package and
retrieves the user-agent
Management receives the new UIVersion field and
store in the Peer's system meta
* rename wiretrustee-signal to netbird-signal
* Rename Signal repositories and source bin
* Adjust docker-compose with signal volume [skip ci]
Co-authored-by: mlsmaycon <mlsmaycon@gmail.com>
Rename documentation and goreleaser build names
Added a migration function for when the old path exists and the new one doesn't
updated the configure.sh to generate the docker-compose with a new path only
if no pre-existing volume with old name exists
UI and CLI Clients are now able to use SSO login by default
we will check if the management has configured or supports SSO providers
daemon will handle fetching and waiting for an access token
Oauth package was moved to internal to avoid one extra package at this stage
Secrets were removed from OAuth
CLI clients have less and better output
2 new status were introduced, NeedsLogin and FailedLogin for better messaging
With NeedsLogin we no longer have endless login attempts
The management will validate the JWT as it does in the API
and will register the Peer to the user's account.
New fields were added to grpc messages in management
and client daemon and its clients were updated
Peer has one new field, UserID,
that will hold the id of the user that registered it
JWT middleware CheckJWT got a splitter
and renamed to support validation for non HTTP requests
Added test for adding new Peer with UserID
Lots of tests update because of a new field
Exposes endpoint under "/users/" that returns information on users.
Calls IDP manager to get information not stored locally (email, name),
which in the case of the managed version is auth0.
* feat(management): add groups
* squash
* feat(management): add handlers for groups
* feat(management): add handlers for groups
* chore(management): add tests for the get group of the management
* chore(management): add tests for save group
* Replace Wiretrustee links and naming
* Upper case for Netbrid in README
* Replace logo
* Dashboard URL to app.netbird.io
Co-authored-by: Misha Bragin <bangvalo@gmail.com>
* Fix IDP Manager config structs with correct tags
When loading the configuration from file
we will use the Auth0ClientConfig and when
sending the post to retrieve a token
we use the auth0JWTRequest with proper tags
Also, removed the idle timeout as it was closing
all idle connections
When account id supplied via claim, we should
handle change of the domain classification.
If category of domain change to private, we
should re-evaluate the private account
* Added Domain Category field and fix store tests
* Add GetAccountByDomain method
* Add Domain Category to authorization claims
* Initial GetAccountWithAuthorizationClaims test cases
* Renamed Private Domain map and index it on saving account
* New Go build tags
* Added NewRegularUser function
* Updated restore to account for primary domain account
Also, added another test case
* Added grouping user of private domains
Also added auxiliary methods for update metadata and domain attributes
* Update http handles get account method and tests
* Fix lint and document another case
* Removed unnecessary log
* Move use cases to method and add flow comments
* Split the new user and existing logic from GetAccountWithAuthorizationClaims
* Review: minor corrections
Co-authored-by: braginini <bangvalo@gmail.com>
* extract claim information from JWT
* get account function
* Store domain
* tests missing domain
* update existing account with domain
* add store domain tests
* test: WIP mocking the grpc server for testing the sending of the client information
* WIP: Test_SystemMetaDataFromClient with mocks, todo:
* fix: failing meta data test
* test: add system meta expectation in management client test
* fix: removing deprecated register function, replacing with new one
* fix: removing deprecated register function from mockclient interface impl
* fix: fixing interface declaration
* chore: remove unused commented code
Co-authored-by: braginini <bangvalo@gmail.com>
* moved wiretrustee version from main to system.info
* added wiretrustee version for all supported platforms
* typo corrected
* refactor: use single WiretrusteeVersion() func to get version of the client
Co-authored-by: braginini <bangvalo@gmail.com>
* update interface tests and configuration messages
* little debug
* little debug on both errors
* print all devs
* list of devices
* debug func
* handle interface close
* debug socks
* debug socks
* if ports match
* use random assigned ports
* remove unused const
* close management client connection when stopping engine
* GracefulStop when management clients are closed
* enable workflows on PRs too
* remove iface_test debug code
* get account id from access token claim
* use GetOrCreateAccountByUser and add test
* correct account id claim
* remove unused account
* Idp manager interface
* auth0 idp manager
* use if instead of switch case
* remove unnecessary lock
* NewAuth0Manager
* move idpmanager to its own package
* update metadata when accountId is not supplied
* update tests with idpmanager field
* format
* new idp manager and config support
* validate if we fetch the interface before converting to string
* split getJWTToken
* improve tests
* proper json fields and handle defer body close
* fix ci lint notes
* documentation and proper defer position
* UpdateUserAppMetadata tests
* update documentation
* ManagerCredentials interface
* Marshal and Unmarshal functions
* fix tests
* ManagerHelper and ManagerHTTPClient
* further tests with mocking
* rename package and custom http client
* sync local packages
* remove idp suffix
* feature: support new management service protocol
* chore: add more logging to track networkmap serial
* refactor: organize peer update code in engine
* chore: fix lint issues
* refactor: extract Signal client interface
* test: add signal client mock
* refactor: introduce Management Service client interface
* chore: place management and signal clients mocks to respective packages
* test: add Serial test to the engine
* fix: lint issues
* test: unit tests for a networkMapUpdate
* test: unit tests Sync update
* feature: introduce NetworkMap to the management protocol with a Serial ID
* test: add Management Sync method protocol test
* test: add Management Sync method NetworkMap field check [FAILING]
* test: add Management Sync method NetworkMap field check [FAILING]
* feature: fill NetworkMap property to when Deleting peer
* feature: fill NetworkMap in the Sync protocol
* test: code review mentions - GeneratePrivateKey() in the test
* fix: wiretrustee client use wireguard GeneratePrivateKey() instead of GenerateKey()
* test: add NetworkMap test
* fix: management_proto test remove store.json on test finish
* chore: [management] - add account serial ID
* Fix concurrency on the client (#183)
* reworked peer connection establishment logic eliminating race conditions and deadlocks while running many peers
* chore: move serial to Network from Account
* feature: increment Network serial ID when adding/removing peers
* chore: extract network struct init to network.go
* chore: add serial test when adding peer to the account
* test: add ModificationID test on AddPeer and DeletePeer
* feature: add User entity to Account
* test: new file store creation test
* test: add FileStore persist-restore tests
* test: add GetOrCreateAccountByUser Accountmanager test
* refactor: rename account manager users file
* refactor: use userId instead of accountId when handling Management HTTP API
* fix: new account creation for every request
* fix: golint
* chore: add account creator to Account Entity to identify who created the account.
* chore: use xid ID generator for account IDs
* fix: test failures
* test: check that CreatedBy is stored when account is stored
* chore: add account copy method
* test: remove test for non existent GetOrCreateAccount func
* chore: add accounts conversion function
* fix: golint
* refactor: simplify admin user creation
* refactor: move migration script to a separate package
* refactor: move goroutine that runs Signal Client Receive to the engine for better control
* chore: fix comments typo
* test: fix golint
* chore: comments update
* chore: consider connection state=READY in signal and management clients
* chore: fix typos
* test: fix signal ping-pong test
* chore: add wait condition to signal client
* refactor: add stream status to the Signal client
* refactor: defer mutex unlock