Release_2021_10_29 (#1893)
* Make federated connection functions work with qualified IDs (#1819)

* Add stub for remote connection creation

* Make connection DB functions work with Qualified

* Simplify name of createConnection

* Fix order of arguments in createConnection

* Do not assert on 1-1 conversation names

* Use Local newtype for some more local arguments

Co-authored-by: jschaul <[email protected]>

* Fix detail in stern online help (#1834)

* Spar Polysemy: SAML2 effect (#1827)

* Use Input effect instead of a MonadReader instance

* Remove ReaderT

* Fix package.yaml

* Changelog

* Review responses

* SAML work

Remove undefineds

Interpreting is really hard

Interpret everything

wip

Add toggleCookie to SAML2

Add Now effect

get it compiling

build

Remove HasCreateUUID instance for Spar

* Cleanup

* CanonicalInterpreter and necessary changes

* Rename to SPImpl

* Fake CI

* Another fake CI

* Use catch in polysemy

* Respond to review

* Changelog

* Apply suggestions from code review

Co-authored-by: fisx <[email protected]>

* Hi CI

* make format

Co-authored-by: fisx <[email protected]>

* Spar Polysemy: Fully polysemize Spar (#1833)

* Remove wrapMonadClientSem

Put it into the Cassandra interpreter instead

* Remove MonadIO instance

* Remove MonadError instance

* Remove ExceptT

* Remove Final IO from Spar

* Fix one use of undefined

* Reporter effect; NO MORE IO

* Remove the Spar newtype

* Remove Spar type

* Stylistic cleanup

* Changelog

* Weird rebase problem

* Review comments

* Use hs-certificate master (#1822)

* Use master branch of hs-certificate

The error handling fix
haskell-tls/hs-certificate#125 has been merged, so
we can just use the upstream master now, and later switch to the hackage
package once it is released.

* Servantify legacy addMember endpoint (#1838)

* Use helmfile's parallelism to speed up integration test setup time (#1805)

Motivation: decrease integration setup time, especially for the default two-backend setup. Make use of tooling used elsewhere, and rely less on hacky bash scripts. See also https://wearezeta.atlassian.net/wiki/spaces/PS/pages/513573957/CI+runs+of+wire-server+state+and+possible+improvements for a discussion of other CI improvement opportunities.

This should shave about 5 minutes off the setup time of each CI run, simply because all helm charts for both backends are now installed in parallel rather than sequentially (that is, `make kube-integration-setup` should now be faster than before this PR).

- Create a few FUTUREWORKS in Jira and link to them from the code comments
- Create two helmfiles, one for federation, one for single-backend
- Add helmfile to nix-shell tooling (Helmfile itself comes with a different version of helm; but since so
far things inside nix-shell are only in use for local development, this
should not matter too much. In the future this can be streamlined with
wire-server-deploy to use the same versions everywhere)

* [Federation] Include Remote Connections in Listing All Connections (#1826)

* Expand a test to also include remote connections while listing

* Remove deprecated endpoint for listing convs (#1840)

* Remove deprecated endpoint for listing convs

Also removed the V2 from the name of the endpoint (in the code, not in
the endpoint path).

* Remove /list-conversations from nginx conf

* Remove use of /list-conversations from End2end

* Federation: Allow connecting to remote users (#1824)

One2One conversations are not created yet. This will be worked on separately.
Legal-hold restrictions are also not dealt with for now; it will not be allowed to turn on legal-hold and federation at the same time.

Co-authored-by: Stefan Matting <[email protected]>
Co-authored-by: jschaul <[email protected]>
Co-authored-by: Akshay Mankar <[email protected]>

* Fix more swagger validation errors (#1841)

* Fix more swagger validation errors

These could be prevented by turning some lists to sets in the swagger2
package, but for now we simply go through all the schemas in the
`Swagger` structure, and apply `nub` on them.

* Various cleanups of Qualified and related types (#1839)

* Refactor tagged Qualified types

This makes the `Local` and `Remote` type constructor safer, because now
it is not possible to change the domain inside a tagged value using the
`Functor` instance.

* Rename `partitionQualified` to `indexQualified`

* Refactor partitionRemoteOrLocalIds

Also rename it to partitionQualified and swap the order of results.

* Refactor and rename `partitionRemote`

The `partitionRemote` function has been renamed to `indexRemote` for
consistency with `indexQualified`, and it now returns a list of `Remote
[a]`, which preserves the information about the domains being remote.

* Remove some uses of toRemoteUnsafe

* Remove convId from ConversationMetadata

Also change type of toRemoteUnsafe and toLocalUnsafe to just take a `Domain` and
an `a` instead of `Qualified a`.

* Remove one more use of toRemoteUnsafe

* Remove lUnqualified and lDomain

We can simply use the general versions that work for both qualified
tags.

* Remove renderQualified and corresponding test

It was completely unused.

* Use data kinds for Id tags

* Better schema instance for `Qualified` values

* Add CHANGELOG entry
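
The safety argument behind the tagged `Qualified` refactoring in #1839 can be illustrated with a minimal sketch. It is not the actual wire-server code: `QTag`, `QualifiedWithTag`, `tUnqualified` and `tDomain` are names assumed here for illustration, and `Domain` is simplified to a wrapper around `String`. The point is that when the tag wraps `Qualified a` and the `Functor` instance only maps over the payload, `fmap` cannot alter the domain.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE KindSignatures #-}

module Main (main) where

-- Simplified stand-in for a backend domain (assumption).
newtype Domain = Domain String
  deriving (Eq, Show)

-- A value together with the domain it belongs to.
data Qualified a = Qualified
  { qUnqualified :: a,
    qDomain :: Domain
  }
  deriving (Eq, Show, Functor)

-- Type-level tag saying whether the domain is ours or someone else's.
data QTag = QLocal | QRemote

-- The derived Functor maps over the payload only; the domain is untouched.
newtype QualifiedWithTag (t :: QTag) a = QualifiedWithTag {tUntagged :: Qualified a}
  deriving (Eq, Show, Functor)

type Local = QualifiedWithTag 'QLocal

type Remote = QualifiedWithTag 'QRemote

-- Unsafe because nothing checks that the domain really is the local one.
toLocalUnsafe :: Domain -> a -> Local a
toLocalUnsafe d x = QualifiedWithTag (Qualified x d)

toRemoteUnsafe :: Domain -> a -> Remote a
toRemoteUnsafe d x = QualifiedWithTag (Qualified x d)

-- Accessors that work for both tags.
tUnqualified :: QualifiedWithTag t a -> a
tUnqualified = qUnqualified . tUntagged

tDomain :: QualifiedWithTag t a -> Domain
tDomain = qDomain . tUntagged

main :: IO ()
main = do
  let alice = toLocalUnsafe (Domain "a.example") "alice" :: Local String
      nameLength = fmap length alice
  print (tUnqualified nameLength)
  print (tDomain nameLength == tDomain alice)
```

With a design where the tag wraps the whole `Qualified a` as the mapped-over value, `fmap` could replace the domain as well; here that is ruled out by construction.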

* Create remote 1-1 conversations (#1825)

* Extract function to create UserList

* Add stub for remote 1-1 conversation creation

* Compute remote 1-1 conversation IDs

* ensureConnected now takes a UserList

* Make /conversations/one2one federation-aware

Converted the endpoint for creating 1-1 conversations to the new
conversation ID algorithm, and enabled the endpoint to create 1-1
conversations with federated users.

Note: the case when the conversation needs to be hosted by the remote
domain is still not implemented. We probably need a new RPC for this
case.

* Remove create from UUID Version class

The create function cannot be defined for all UUID versions.

* Introduce V5 UUIDs and use them for 1-1 conv

* Servantify internal endpoint for connect conv

* Make recipient field of connect event qualified

* Extract function to create legacy connect conv

* Add tests for the conversation ID algorithm

* write internal with stubs for data functions

* Implement a function for creating and updating a 1-1 remote conversation

- The function is Galley.API.One2One.iUpsertOne2OneConversation

* use schema-profunctor for json instances

galley-types: no lax

* galley-types rename module to Intra

* galley: remove "these" dep

galley.cabal

* fix impossible example

* remove todo

* un-nameclash: one2OneConvId -> localOne2OneConvId

* remove warning suppression

* brig: add rpc function

* change api: always return a conv id

* Add tests for one2one conversation internal endpoint

* Test remote one2one conversation case

* Update golden tests after change in connect event

* Add CHANGELOG entry

* Remove incorrect comment

Co-authored-by: Marko Dimjašević <[email protected]>
Co-authored-by: Stefan Matting <[email protected]>

* Leave a note with a link to a Jira ticket about a flaky test (#1844)

* Make non-collision test for 1-1 conv ids faster (#1846)

The `anySame` function has quadratic runtime, but here we can use an
`Ord` instance, and just compare the `nubOrd` lists. This also removes a
potential flakiness caused by repeated input pairs (which should be
quite likely to happen, given the low entropy of the UUID generator).
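
A rough sketch of the idea, not the actual test code: deduplicate the inputs with `nubOrd` first, then check that mapping them yields equally many distinct outputs. `noCollisions` is a hypothetical helper name, and plain `Int`s stand in for the ID pairs; `nubOrd` comes from `Data.Containers.ListUtils` in the `containers` package and runs in O(n log n).

```haskell
module Main (main) where

import Data.Containers.ListUtils (nubOrd)

-- | True when the function maps distinct inputs to distinct outputs.
-- Repeated inputs are removed first, so they cannot cause a false alarm.
noCollisions :: (Ord a, Ord b) => (a -> b) -> [a] -> Bool
noCollisions f inputs =
  let distinctInputs = nubOrd inputs
   in length (nubOrd (map f distinctInputs)) == length distinctInputs

main :: IO ()
main = do
  -- Doubling is injective, and the repeated 2 is ignored.
  print (noCollisions (* 2) [1, 2, 2, 3 :: Int])
  -- Integer division by 2 sends both 2 and 3 to 1, so this reports a collision.
  print (noCollisions (`div` 2) [1, 2, 3 :: Int])
```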

* add comment to test for FUTUREWORK (#1848)

* Fix error in member csv creation (SAML.UserRef decoding error) (#1828)

* Add failing test case.

* Nit-pick.

* Do not git-ignore pem files (at least not all of them).

* Fix error message.

* More detail in scim error responses.

* An idea.

* Implement the idea.

* FUTUREWORK.

* Update One2One conversation when connection status changes (#1850)

* move one2oneConvId to galley-types

* implement updateOne2OneConv and simple test

* add more test cases

* Clarify 403 in test

* add changelog entry

* chore: [charts] Update webapp version (#1836)

Co-authored-by: Zebot <[email protected]>

* chore: [charts] Update team-settings version (#1835)

Co-authored-by: Zebot <[email protected]>

* update to latest SFT. (#1849)

* update to latest SFT.

* Add changelog entry for SFT

Co-authored-by: jschaul <[email protected]>

* Upgrade webapp/team-settings: changelog entries for #1835 and #1836 (#1856)

* Fix SFTD in umbrella chart (#1677)

* Fix SFTD in umbrella chart

* changelog

Co-authored-by: jschaul <[email protected]>

* Move SFTD public IP docs to the top (#1672)

It's the thing people confuse the most. Hopefully people will get it wrong less now

* [charts:sftd] Introduce flag to enable TURN discovery (#1519)

* [charts:sftd] Introduce flag to enable TURN discovery

* -f integrate review feedback

* changelog

Co-authored-by: jschaul <[email protected]>

* Check extended key usage of server certificates (#1855)

* Test that server key usage is checked for fed cert

* Reject certificates without server usage flag

* Access updates affect remote users (#1854)

* Rename NotificationTargets to BotsAndMembers

* Refactor logic to remove users after access update

 - Avoid using lenses and state; since there are only two updates, these
 can be threaded manually pretty easily.
 - Rename the `NotificationTargets` type to `BotsAndMembers`, and use
 that instead of pairs (or triples) in the access update function.

This endpoint is still not properly federation-aware, since remote
members are not removed, and local member removals are not propagated to
remotes.

Co-authored-by: Stefan Matting <[email protected]>

* Re-enable multiple victim when removing members

This is useful to batch removals occurring after an access update to a
conversation.

* Remove and notify remotes on access update

* Access update removal tests

* Remove duplication in test conversation creation

Co-authored-by: Paolo Capriotti <[email protected]>
Co-authored-by: Marko Dimjašević <[email protected]>

* Change tag (#1859)

* Check connections when adding remote users to a conv (#1842)

* Delete stale FUTUREWORK

* Brig: delete deprecated 'GET /i/users/connections-status` endpoint

* brig: Servantify POST /i/users/connection-status

* brig: Add internal endpoint to get qualified connection statuses

* Brig: Support creating accepted connections for tests

The endpoint just creates DB entries without actually contacting the remote
backend. This is very useful when galley tests need a remote connection to exist

* wire-api: roundtrip test for To/FromByteString @relation

The instances were deleted a couple of commits ago.

* Check conn between adder and remotes when adding remotes to conv

* Check connection between conversation creator and remote members

* Do connection checking in onConversationCreated in the federation API

* Make existing federation tests succeed again by sprinkling some connections

* Add a (still failing) test for on-conversation-created

* Add more connections to pass federation API tests

* onConvCreated: Ensure creator of conv is included as other member

* More coverage for onConvCreated

* onConvUpdated: Only allow connected users to add local users

* Add test case: Only unconnected users to add

* Fix integration tests

Co-authored-by: Marko Dimjašević <[email protected]>
Co-authored-by: jschaul <[email protected]>
Co-authored-by: Stefan Matting <[email protected]>
Co-authored-by: Paolo Capriotti <[email protected]>

* Make conversation creator unqualified in on-conversation-created RPC (#1858)

* Unqualify rcOrigId in `on-conversation-created`

Also add some Remote and Local tags to various functions.

* Simplify partitioning in onConversationCreated

* Improve comment about creator ID in RPC

* Ensure creator in the conv domain in tests

Co-authored-by: jschaul <[email protected]>

* Parallelise RPCs (#1860)

* Add runFederatedConcurrently utility

* Parallelise remote conversation notification

* Add Local and Remote tags to profile functions

* Parallelise RPCs for fetching profiles

* Rename indexRemote to bucketRemote

This makes it consistent with indexQualified and bucketQualified.

* Move traverseWithErrors to Util module

* Parallelise claimMultiPrekeyBundles

* Close GRPC client after making a request to a remote federator (#1865)

* Add Resource effect to InternalServer stack

* Ensure GRPC clients are closed after a request

* Allow using kind cluster with imagePullPolicy=Never (#1862)

* Allow using kind cluster with imagePullPolicy=Never

drive-by fix: create namespace if it doesn't exist yet

* Update helm version in nix-shell to fit version used elsewhere

* set kind kubeconfig permissions correctly

* fixup helmfile

* Hi CI

* disable flaky test in gundeck (#1867)

* disable flaky test in gundeck

* Hi CI

* Check connections when creating group and team convs with remote members  (#1870)

* Remove unnecessary remote domain from mock federator

* Remove unnecessary check for remote users' existence in createConv

Since we check for connections, we don't need to also find out if the users
exist.

* Check remote connections when creating team conv

Just like for regular group conversations, do not fetch profiles, and
instead check both local and remote connections.

Also added failure tests for team conversation creation with unconnected
locals or remotes.

* Remove opts argument for mock federator

* Add CHANGELOG entries

Co-authored-by: Paolo Capriotti <[email protected]>

* minor Readme: document usage of helm charts (#1307)

* Support deleting conversations with federated users (#1861)

* Refactor: Use pushConversationEvent

* add onConversationDeleted RPC

* deleteTeamConversation: rpc onConversationDeleted

* Data.deleteConversation: remove remotes

* add changelog entry

* wire-api: extend ConversationAction

* onConversationDeleted -> onConversationUpdated

* fix compilation

* remove duplicated import

* cosmetic change

* fix call to withTempServantMockFederator

* Remove a leftover TODO that was addressed (#1868)

* In Conversation Endpoints Make the members.self ID Qualified (#1866)

* Make the self member's ID qualified
* Simplify conversation view functions
* Unrelated small change: remove a cycle of qualifying a conversation ID in a test
* Introduce qualifyLocal to the BotNet monad

* Changelog script: skip empty sections (#1871)

* Replace shell.nix with a direnv + nixpkgs.buildEnv based setup (#1876)

* Replace shell.nix with a direnv + nixpkgs.buildEnv based setup

* Add instructions on how to use nix-hls.sh from emacs

* Correctly update PATH in .envrc (#1877)

* Introduce 'make flake-PATTERN' (#1875)

Add a 'make flake-PATTERN' target to run a subset of tests multiple times to trigger a failure case in flaky tests. By default the test(s) will run up to 1000 times until a failure occurs, at which point it will stop. Scrolling up on the output will show you how many tests had to run to trigger a failure.

example output:

```
make flake-sso-id
echo 'set -ex' > /tmp/flake.sh
chmod +x /tmp/flake.sh
for i in $(seq 1000); do \
	echo "echo $i" >> /tmp/flake.sh; \
	echo '../../dist/brig-integration -s brig.integration.yaml -i ../integration.yaml -p "sso-id" ' >> /tmp/flake.sh; \
done
INTEGRATION_USE_NGINZ=1 ../integration.sh /tmp/flake.sh
Running tests using mocked AWS services
[cannon] I, Listening on 127.0.0.1:8083
[cannon] I, Listening on 127.0.0.1:8183
[cargohold] I, Listening on 0.0.0.0:8084
[spar] I, logger=cassandra.spar, Known hosts: [datacenter1:rack1:127.0.0.1:9042]
[federator] D, inotify initialized, inotify=<inotify fd=11>
[gundeck] I, Listening on 0.0.0.0:8086
[galley] I, Listening on 127.0.0.1:8085
[spar] I, Listening on 0.0.0.0:8088
[nginz] 127.0.0.1 - - [20/Oct/2021:16:33:50 +0200] "GET /i/status HTTP/1.1" 200 0 "-" "curl/7.71.1" "-" - 2 0.000 - - - - 3cabaf643c510db36a3c989301d73569
all services are up!
++ echo 1
1
++ ../../dist/brig-integration -s brig.integration.yaml -i ../integration.yaml -p sso-id
2021-10-20T14:33:51Z, D, Connecting to 127.0.0.1:9042
2021-10-20T14:33:51Z, I, Known hosts: [datacenter1:rack1:127.0.0.1:9042]
2021-10-20T14:33:51Z, I, New control connection: datacenter1:rack1:127.0.0.1:9042#<socket: 3>
Brig API Integration
  user
    account
      put /i/users/:uid/sso-id: OK (0.82s)

All 1 tests passed (0.83s)
++ echo 2
2
++ ../../dist/brig-integration -s brig.integration.yaml -i ../integration.yaml -p sso-id
2021-10-20T14:33:53Z, D, Connecting to 127.0.0.1:9042
Brig API Integration
  user
    account
      put /i/users/:uid/sso-id: 2021-10-20T14:33:53Z, I, Known hosts: [datacenter1:rack1:127.0.0.1:9042]
2021-10-20T14:33:53Z, I, New control connection: datacenter1:rack1:127.0.0.1:9042#<socket: 3>
OK (0.85s)

All 1 tests passed (0.85s)
++ echo 3
3
++ ../../dist/brig-integration -s brig.integration.yaml -i ../integration.yaml -p sso-id
2021-10-20T14:33:55Z, D, Connecting to 127.0.0.1:9042
Brig API Integration
  user
    account
      put /i/users/:uid/sso-id: 2021-10-20T14:33:55Z, I, Known hosts: [datacenter1:rack1:127.0.0.1:9042]
2021-10-20T14:33:55Z, I, New control connection: datacenter1:rack1:127.0.0.1:9042#<socket: 3>
OK (0.77s)

All 1 tests passed (0.77s)
++ echo 4
4
++ ../../dist/brig-integration -s brig.integration.yaml -i ../integration.yaml -p sso-id
2021-10-20T14:33:56Z, D, Connecting to 127.0.0.1:9042
Brig API Integration
  user
    account
      put /i/users/:uid/sso-id: 2021-10-20T14:33:56Z, I, Known hosts: [datacenter1:rack1:127.0.0.1:9042]
2021-10-20T14:33:56Z, I, New control connection: datacenter1:rack1:127.0.0.1:9042#<socket: 3>
OK (0.79s)

All 1 tests passed (0.79s)
++ echo 5
5
++ ../../dist/brig-integration -s brig.integration.yaml -i ../integration.yaml -p sso-id
```

When a failure happens:

```
++ echo 282
282
++ ../../dist/brig-integration -s brig.integration.yaml -i ../integration.yaml -p sso-id
2021-10-20T14:41:25Z, D, Connecting to 127.0.0.1:9042
[brig] W, logger=cassandra.brig, Server warning: Read 0 live rows and 2102 tombstone cells for query SELECT * FROM brig_test.users_pending_activation WHERE  LIMIT 10000 (see tombstone_warn_threshold)
Brig API Integration
  user
    account
      put /i/users/:uid/sso-id: 2021-10-20T14:41:25Z, I, Known hosts: [datacenter1:rack1:127.0.0.1:9042]
2021-10-20T14:41:25Z, I, New control connection: datacenter1:rack1:127.0.0.1:9042#<socket: 3>
[brig] W, logger=cassandra.brig, Server warning: Read 0 live rows and 2104 tombstone cells for query SELECT * FROM brig_test.users_pending_activation WHERE  LIMIT 10000 (see tombstone_warn_threshold)
FAIL
        Exception: Assertions failed:
         1: 202 =/= 403
         2: updatePhone (PUT /self/phone): failed to update to Phone {fromPhone = "+046965171332989"} - might be a flaky test tracked in https://wearezeta.atlassian.net/browse/BE-526

        Response was:

        Response {responseStatus = Status {statusCode = 403, statusMessage = "Forbidden"}, responseVersion = HTTP/1.1, responseHeaders = [("Transfer-Encoding","chunked"),("Date","Wed, 20 Oct 2021 14:41:27 GMT"),("Server","Warp/3.3.13"),("Content-Encoding","gzip"),("Content-Type","application/json")], responseBody = Just "{\"code\":403,\"message\":\"The given phone number has been blacklisted due to suspected abuse or a complaint.\",\"label\":\"blacklisted-phone\"}", responseCookieJar = CJ {expose = []}, responseClose' = ResponseClose}
        CallStack (from HasCallStack):
          error, called at src/Bilge/Assert.hs:89:5 in bilge-0.22.0-5tCtgpJGKRb38JsbN4shGd:Bilge.Assert
          <!!, called at src/Bilge/Assert.hs:107:19 in bilge-0.22.0-5tCtgpJGKRb38JsbN4shGd:Bilge.Assert
          !!!, called at test/integration/Util.hs:735:3 in main:Util
          updatePhone, called at test/integration/API/User/Account.hs:1230:11 in main:API.User.Account

1 out of 1 tests failed (0.79s)
Terminated
Terminated
[brig] W, logger=cassandra.brig, Server warning: Read 0 live rows and 2106 tombstone cells for query SELECT * FROM brig_test.users_pending_activation WHERE  LIMIT 10000 (see tombstone_warn_threshold)
make: *** [Makefile:114: flake-sso-id] Error 1

```

* updatePhone deflake (#1874)

* updatePhone deflake debugging information

This is about https://wearezeta.atlassian.net/browse/BE-526

I think what's happening is that one test that tests the phone blocking
adds a record to brig.excluded_phones. Then another,
unrelated test, if unlucky enough to randomly generate a phone number
covered by that prefix, fails in the PUT /self/phone call.

* 1) update integration test output to give better information and link
  to a flaky test description
* 2) change the code to (hopefully) prevent this flake from recurring.

The changes to integration tests will lead to the following output on
failure:

  user
    account
      put /i/users/:uid/sso-id:
        Exception: Assertions failed:
         1: 202 =/= 403
         2: updatePhone (PUT /self/phone): failed to update to Phone {fromPhone = "+046965171332989"} - might be a flaky test tracked in https://wearezeta.atlassian.net/browse/BE-526

        Response was:

        Response {responseStatus = Status {statusCode = 403, statusMessage = "Forbidden"}, responseVersion = HTTP/1.1, responseHeaders = [("Transfer-Encoding","chunked"),("Date","Wed, 20 Oct 2021 14:41:27 GMT"),("Server","Warp/3.3.13"),("Content-Encoding","gzip"),("Content-Type","application/json")], responseBody = Just "{\"code\":403,\"message\":\"The given phone number has been blacklisted due to suspected abuse or a complaint.\",\"label\":\"blacklisted-phone\"}", responseCookieJar = CJ {expose = []}, responseClose' = ResponseClose}
        CallStack (from HasCallStack):
          error, called at src/Bilge/Assert.hs:89:5 in bilge-0.22.0-5tCtgpJGKRb38JsbN4shGd:Bilge.Assert
          <!!, called at src/Bilge/Assert.hs:107:19 in bilge-0.22.0-5tCtgpJGKRb38JsbN4shGd:Bilge.Assert
          !!!, called at test/integration/Util.hs:735:3 in main:Util
          updatePhone, called at test/integration/API/User/Account.hs:1230:11 in main:API.User.Account

* undo changes in src as they make another test fail

* Add a cleanup line

* fixup

* Hi CI

* Include conv creator only once in notifications sent to remotes (#1879)

To remove any confusion in the `on-conversation-created` federation API, rename
"members" to "non_creator_members", as the creator is already specified in
"orig_user_id".

Also:
- Add Golden tests for `NewRemoteConversation`
- Add integration tests for creating conversation with remote users

* Optimise remote user deletion (#1872)

Creates two Federation RPCs:

* In brig: on-user-deleted, notify about the connections in chunks of 1000 users.
* In galley: on-user-deleted, notify about the conversations in chunks of 1000 conversations.

When writing integration tests in brig, we can mock the federator for brig but not for galley, because the two RPCs must be made from two separate places. So we had to mock out galley to be able to test the brig functionality. The galley functionality is tested separately by calling the internal endpoint.

Co-authored-by: Akshay Mankar <[email protected]>
Co-authored-by: Stefan Matting <[email protected]>
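
The chunked notification described for #1872 could be sketched roughly as follows. This is an illustration under assumptions, not the brig or galley implementation: `chunksOf` is written by hand here, `notifyInChunks` is a hypothetical helper, and the "RPC" is just a `putStrLn`.

```haskell
module Main (main) where

-- | Split a list into consecutive chunks of at most n elements.
chunksOf :: Int -> [a] -> [[a]]
chunksOf n xs
  | n <= 0 = error "chunksOf: chunk size must be positive"
  | otherwise = go xs
  where
    go [] = []
    go ys = let (chunk, rest) = splitAt n ys in chunk : go rest

-- | Run one notification action per chunk, in order.
notifyInChunks :: Monad m => Int -> ([a] -> m ()) -> [a] -> m ()
notifyInChunks size notify = mapM_ notify . chunksOf size

main :: IO ()
main = do
  -- 2500 placeholder IDs are sent as chunks of 1000, 1000 and 500.
  let connectionIds = [1 .. 2500 :: Int]
  notifyInChunks
    1000
    (\chunk -> putStrLn ("notifying about " ++ show (length chunk) ++ " items"))
    connectionIds
```

Bounding the chunk size keeps each request to a remote backend small regardless of how many connections or conversations the deleted user had.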

* Set federator's default log level to Info (#1882)

* Rename the two federation/on-user-deleted endpoints (#1883)

* Update Federation API conventions doc in prep for on-user-deleted

* brig/galley: Rename the two federation/on-user-deleted endpoints

This is to ensure that they do not overlap. This will hopefully make it easier
to merge brig and galley.

* Extract type level vars for UserDeleteNotificationMax{Conns,Convs}

* Galley polysemy (1/5) - Introduce Sem and "access" effects (#1881)

* Add type variable to Galley monad

This is step 0 in the process of converting galley to effects. We
introduce a phantom type variable `r` in the `Galley` monad, which will
later be used for the effect row.

* Use API instead of DB access in 1-1 conv test

* Monomorphise Data functions

* Avoid MonadUnliftIO in Bilge.RPC

* Remove unneeded MonadLogger constraint

* Introduce fine-grained placeholder effects

This commit introduces several placeholder effects, mostly having to do
with making HTTP requests. All the existing uses of `MonadUnliftIO` are
now either gone, or hidden behind one of these effects, and that made it
possible to get rid of the `MonadUnliftIO` instance for `Galley`.

Also, the `Galley0` type synonym now refers to `Galley` without any
effects, so `runGalley` and related functions now take a `Galley
GalleyEffects` instead.

`Galley0` still has a `MonadUnliftIO` instance, so it can be used as a
temporary crutch to get access to async primitives. Those need to be run
in `Galley0`, and finally lifted to a general `Galley r` monad.
Eventually, the `Galley0` actions will simply be replaced by effect
actions, and the code actually using `MonadUnliftIO` will be relegated
to interpreters.

* Remove MonadMask instance of Galley

This also introduces a `SparAccess` effect and adds a few more
`BrigAccess` and `BotAccess` constraints.

* Remove MonadCatch instance of Galley

* Turn Galley into a Sem newtype

The underlying `Sem` monad in `Galley` is an arbitrary effect stack that
contains at least the effects which replicate the functionality of the
original `Galley` monad. All the functionality has been reimplemented in
terms of `Sem`, so the existing code does not need to be changed at all.

* Allow configuring nginz so it serves the deeplink for apps to discover the backend (#1889)

Allow nginz to serve a deeplink (see also https://docs.wire.com/how-to/associate/deeplink.html )

Co-authored-by: jschaul <[email protected]>

* upgrade webapp to federation-capable (not for production use!) version. (#1892)

* Release 2021_10_29

Co-authored-by: Paolo Capriotti <[email protected]>
Co-authored-by: jschaul <[email protected]>
Co-authored-by: fisx <[email protected]>
Co-authored-by: Sandy Maguire <[email protected]>
Co-authored-by: Marko Dimjašević <[email protected]>
Co-authored-by: Stefan Matting <[email protected]>
Co-authored-by: Akshay Mankar <[email protected]>
Co-authored-by: Stefan Matting <[email protected]>
Co-authored-by: zebot <[email protected]>
Co-authored-by: Zebot <[email protected]>
Co-authored-by: Arian van Putten <[email protected]>
Co-authored-by: Lucendio <[email protected]>
13 people authored Oct 29, 2021
1 parent 143ee9f commit d6b9490
Showing 260 changed files with 9,054 additions and 4,042 deletions.
5 changes: 5 additions & 0 deletions .envrc
@@ -0,0 +1,5 @@
env=$(nix-build --no-out-link "$PWD/direnv.nix")
PATH_add "${env}/bin"

# allow local .envrc overrides
[[ -f .envrc.local ]] && source_env .envrc.local
3 changes: 1 addition & 2 deletions .gitignore
@@ -33,7 +33,6 @@ TAGS
.stack-docker-profile
.metadata
*.tix
*.pem
.DS_Store
services/nginz/src
services/.env
@@ -99,4 +98,4 @@ i.yaml
b.yaml
telepresence.log

/.ghci
/.ghci
62 changes: 62 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,67 @@
<!-- if you're not the release manager, do your edits to changelog under CHANGELOG.d/ -->

# [2021-10-29]

## Release notes

* Upgrade SFT to 2.1.15 (#1849)
* Upgrade team settings to Release: [v4.2.0](https://github.com/wireapp/wire-team-settings/releases/tag/v4.2.0) and image tag: 4.2.0-v0.28.28-1e2ef7 (#1856)
* Upgrade Webapp to image tag: 20021-10-28-federation-m1 (#1856)

## API changes

* Remove `POST /list-conversations` endpoint. (#1840)
* The member.self ID in conversation endpoints is qualified and available as
"qualified_id". The old unqualified "id" is still available. (#1866)

## Features

* Allow configuring nginz so it serve the deeplink for apps to discover the backend (#1889)
* SFT: allow using TURN discovery using 'turnDiscoveryEnabled' (#1519)

## Bug fixes and other updates

* Fix an issue related to installing the SFT helm chart as a sub chart to the wire-server chart. (#1677)
* SAML columns (Issuer, NameID) in CSV files with team members. (#1828)

## Internal changes

* Add a 'make flake-PATTERN' target to run a subset of tests multiple times to trigger a failure case in flaky tests (#1875)
* Avoid a flaky test to fail related to phone updates and improve failure output. (#1874)
* Brig: Delete deprecated `GET /i/users/connections-status` endpoint. (#1842)
* Replace shell.nix with direnv + nixpkgs.buildEnv based setup (#1876)
* Make connection DB functions work with Qualified IDs (#1819)
* Fix more Swagger validation errors. (#1841)
* Turn `Galley` into a polysemy monad stack. (#1881)
* Internal CI tooling improvement: decrease integration setup time by using helmfile. (#1805)
* Depend on hs-certificate master instead of our fork (#1822)
* Add internal endpoint to insert or update a 1-1 conversation. This is to be used by brig when updating the status of a connection. (#1825)
* Update helm to 3.6.3 in developer tooling (nix-shell) (#1862)
* Improve the `Qualified` abstraction and make local/remote tagging safer (#1839)
* Add some new Spar effects, completely isolating us from saml2-web-sso interface (#1827)
* Convert legacy POST conversations/:cnv/members endpoint to Servant (#1838)
* Simplify mock federator interface by removing unnecessary arguments. (#1870)
* Replace the `Spar` newtype, instead using `Sem` directly. (#1833)

## Federation changes

* Remove remote guests as well as local ones when "Guests and services" is disabled in a group conversation, and propagate removal to remote members. (#1854)
* Check connections when adding remote users to a local conversation and local users to remote conversations. (#1842)
* Check connections when creating group and team conversations with remote members. (#1870)
* Server certificates without the "serverAuth" extended usage flag are now rejected when connecting to a remote federator. (#1855)
* Close GRPC client after making a request to a remote federator. (#1865)
* Support deleting conversations with federated users (#1861)
* Ensure that the conversation creator is included only once in notifications sent to remote users (#1879)
* Allow connecting to remote users. One to one conversations are not created yet. (#1824)
* Make federator's default log level Info (#1882)
* The creator of a conversation now appears as a member when the conversation is fetched from a remote backend (#1842)
* Include remote connections in the response to `POST /list-connections` (#1826)
* When a user gets deleted, notify remotes about conversations and connections in chunks of 1000 (#1872, #1883)
* Make federated requests to multiple backends in parallel. (#1860)
* Make conversation ID of `RemoteConversation` unqualified and move it out of the metadata record. (#1839)
* Make the conversation creator field in the `on-conversation-created` RPC unqualified. (#1858)
* Update One2One conversation when connection status changes (#1850)

# [2021-10-01]

## Release notes
9 changes: 5 additions & 4 deletions Makefile
@@ -234,7 +234,7 @@ libzauth:
.PHONY: hie.yaml
hie.yaml: stack-dev.yaml
stack build implicit-hie
stack exec gen-hie | nix-shell --command 'yq "{cradle: {stack: {stackYaml: \"./stack-dev.yaml\", components: .cradle.stack}}}" > hie.yaml'
stack exec gen-hie | yq "{cradle: {stack: {stackYaml: \"./stack-dev.yaml\", components: .cradle.stack}}}" > hie.yaml

.PHONY: stack-dev.yaml
stack-dev.yaml:
@@ -311,7 +311,7 @@ release-chart-%:
.PHONY: guard-tag
guard-tag:
@if [ "${DOCKER_TAG}" = "${USER}" ]; then \
echo "Environment variable DOCKER_TAG not set to non-default value. Re-run with DOCKER_TAG=<something>. Try using 'make latest-brig-tag' for latest develop docker image tag";\
echo "Environment variable DOCKER_TAG not set to non-default value. Re-run with DOCKER_TAG=<something>. Try using 'make latest-tag' for latest develop docker image tag";\
exit 1; \
fi

@@ -403,6 +403,7 @@ kind-reset: kind-delete kind-cluster
.local/kind-kubeconfig:
mkdir -p $(CURDIR)/.local
kind get kubeconfig --name $(KIND_CLUSTER_NAME) > $(CURDIR)/.local/kind-kubeconfig
chmod 0600 $(CURDIR)/.local/kind-kubeconfig

# This guard is a fail-early way to save needing to debug nginz container not
# starting up in the second namespace of the kind cluster in some cases. Error
@@ -429,11 +430,11 @@ guard-inotify:

.PHONY: kind-integration-setup
kind-integration-setup: guard-inotify .local/kind-kubeconfig
ENABLE_KIND_VALUES="1" KUBECONFIG=$(CURDIR)/.local/kind-kubeconfig make kube-integration-setup
HELMFILE_ENV="kind" KUBECONFIG=$(CURDIR)/.local/kind-kubeconfig make kube-integration-setup

.PHONY: kind-integration-test
kind-integration-test: .local/kind-kubeconfig
ENABLE_KIND_VALUES="1" KUBECONFIG=$(CURDIR)/.local/kind-kubeconfig make kube-integration-test
HELMFILE_ENV="kind" KUBECONFIG=$(CURDIR)/.local/kind-kubeconfig make kube-integration-test

kind-integration-e2e: .local/kind-kubeconfig
cd services/brig && KUBECONFIG=$(CURDIR)/.local/kind-kubeconfig ./federation-tests.sh $(NAMESPACE)
4 changes: 2 additions & 2 deletions README.md
@@ -63,8 +63,8 @@ It also contains
- **build**: Build scripts and Dockerfiles for some platforms
- **deploy**: (Work-in-progress) - how to run wire-server in an ephemeral, in-memory demo mode
- **doc**: Documentation
- **hack**: scripts and configuration for kubernetes helm chart development/releases mainly used by CI
- **charts**: kubernetes helm charts
- **hack**: scripts and configuration for Kubernetes helm chart development/releases mainly used by CI
- **charts**: Kubernetes Helm charts. The charts are mirrored to S3 and can be used with `helm repo add wire https://s3-eu-west-1.amazonaws.com/public.wire.com/charts`. See the [Administrator's Guide](https://docs.wire.com) for more info.

## Architecture Overview

6 changes: 5 additions & 1 deletion changelog.d/mk-changelog.sh
@@ -13,10 +13,14 @@ getPRNumber() {
for d in "$DIR"/*; do
if [[ ! -d "$d" ]]; then continue; fi

entries=("$d"/*[^~])

if [[ ${#entries[@]} -eq 0 ]]; then continue; fi

echo -n "## "
sed '$ a\' "$d/.title"
echo ""
for f in "$d"/*[^~]; do
for f in "${entries[@]}"; do
pr=$(getPRNumber $f)
sed -r '
# create a bullet point on the first line
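
The guard added in this hunk skips a section only when the glob expands to zero entries. In bash that requires `nullglob`: without it, an unmatched pattern stays in the array as a literal string, so the array length is 1 rather than 0. Whether `mk-changelog.sh` enables `nullglob` is not visible in this hunk, so the sketch below sets it explicitly as an assumption. `count_entries` and the temporary directory layout are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: count_entries is a hypothetical helper, not part of mk-changelog.sh.
shopt -s nullglob   # assumption: unmatched globs expand to nothing

# Print how many non-backup entries (names not ending in "~") a directory has.
count_entries() {
  local d=$1
  local entries=("$d"/*[^~])
  echo "${#entries[@]}"
}

demo=$(mktemp -d)
mkdir "$demo/empty" "$demo/full"
touch "$demo/full/1-fix" "$demo/full/1-fix~"

count_entries "$demo/empty"   # no entries, so the section would be skipped
count_entries "$demo/full"    # the editor backup "1-fix~" is not counted
```

Under these assumptions the two calls print 0 and 1. Dropping the `shopt` line makes the first call print 1, which is the case where the `-eq 0` check would never trigger.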
2 changes: 2 additions & 0 deletions charts/fake-aws-s3/values.yaml
@@ -7,6 +7,8 @@ minio:
enabled: false
environment:
MINIO_BROWSER: "off"
defaultBucket:
name: dummy-bucket
buckets:
- name: dummy-bucket
purge: true
2 changes: 1 addition & 1 deletion charts/federator/values.yaml
@@ -23,7 +23,7 @@ resources:
memory: "512Mi"
cpu: "500m"
config:
logLevel: Debug
logLevel: Info
logFormat: JSON
optSettings:
# Defaults to using system CA store in the federator image for making
15 changes: 15 additions & 0 deletions charts/nginz/templates/conf/_deeplink.html.tpl
@@ -0,0 +1,15 @@
{{- define "nginz_deeplink.html" }}
{{/* See https://docs.wire.com/how-to/associate/deeplink.html
(or search for "deeplink" on docs.wire.com)
for details on use of the deeplink*/}}
<html>
<head></head>
<body>
{{- if hasKey .Values.nginx_conf "deeplink" }}
<a href="wire://access/?config={{ .Values.nginx_conf.deeplink.endpoints.backendURL }}/deeplink.json">Click here for access</a>
{{- else }}
No Deep Link.
{{- end }}
</body>
</html>
{{- end }}
24 changes: 24 additions & 0 deletions charts/nginz/templates/conf/_deeplink.json.tpl
@@ -0,0 +1,24 @@
{{- define "nginz_deeplink.json" }}
{{- if hasKey .Values.nginx_conf "deeplink" }}
{{- with .Values.nginx_conf.deeplink }}
{{/* See https://docs.wire.com/how-to/associate/deeplink.html
(or search for "deeplink" on docs.wire.com)
for details on use of the deeplink*/}}
{
"endpoints" : {
{{- with .endpoints }}
"backendURL" : {{ .backendURL | quote }},
"backendWSURL": {{ .backendWSURL | quote }},
"blackListURL": {{ .blackListURL | quote }},
"teamsURL": {{ .teamsURL | quote }},
"accountsURL": {{ .accountsURL | quote }},
"websiteURL": {{ .websiteURL | quote }}
{{- end }}
},
"title" : {{ .title | quote }}
}
{{- end }}
{{- else }}
{}
{{- end }}
{{- end }}
19 changes: 19 additions & 0 deletions charts/nginz/templates/conf/_nginx.conf.tpl
@@ -344,6 +344,25 @@ http {
image/png png;
}
}

{{- if hasKey .Values.nginx_conf "deeplink" }}
location ~* ^/deeplink.(json|html)$ {
zauth off;
root /etc/wire/nginz/conf/;
types {
application/json json;
text/html html;
}
if ($request_method = 'OPTIONS') {
add_header 'Access-Control-Allow-Methods' "GET, OPTIONS";
add_header 'Access-Control-Allow-Headers' "$http_access_control_request_headers, DNT,X-Mx-ReqToken,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type";
add_header 'Content-Type' 'text/plain; charset=UTF-8';
add_header 'Content-Length' 0;
return 204;
}
more_set_headers 'Access-Control-Allow-Origin: $http_origin';
}
{{- end }}
}
}
{{- end }}
4 changes: 4 additions & 0 deletions charts/nginz/templates/configmap.yaml
@@ -6,6 +6,10 @@ data:
{{- include "nginz_upstreams.txt" . | indent 4 }}
zwagger-config.js: |2
{{- include "nginz_zwagger-config.js" . | indent 4 }}
deeplink.json: |2
{{- include "nginz_deeplink.json" . | indent 4 }}
deeplink.html: |2
{{- include "nginz_deeplink.html" . | indent 4 }}
{{ (.Files.Glob "conf/static/*").AsConfig | indent 2 }}
kind: ConfigMap
metadata:
12 changes: 9 additions & 3 deletions charts/nginz/values.yaml
@@ -32,6 +32,15 @@ nginx_conf:
worker_rlimit_nofile: 131072
worker_connections: 65536
swagger_root: /var/www/swagger
# deeplink:
# endpoints:
# backendURL: "https://prod-nginz-https.wire.com"
# backendWSURL: "https://prod-nginz-ssl.wire.com"
# blackListURL: "https://clientblacklist.wire.com/prod"
# teamsURL: "https://teams.wire.com"
# accountsURL: "https://accounts.wire.com"
# websiteURL: "https://wire.com"
# title: "Production"
disabled_paths:
- /conversations/last-events
- ~* ^/conversations/([^/]*)/knock
@@ -304,9 +313,6 @@ nginx_conf:
envs:
- all
doc: true
- path: ~* ^/list-conversations$
envs:
- all
- path: ~* ^/teams$
envs:
- all
2 changes: 1 addition & 1 deletion charts/sftd/Chart.yaml
@@ -11,4 +11,4 @@ version: 0.0.42
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
appVersion: 2.0.127
appVersion: 2.1.15
51 changes: 26 additions & 25 deletions charts/sftd/README.md
@@ -111,6 +111,32 @@ able to reach the restund servers on their public IPs.
More exotic setups _are_ possible but are currently *not* officially supported. Please
contact us if you have different constraints.

### No public IP on default interface

Often on-prem or at certain cloud providers your nodes will not have directly routable public IP addresses
but are deployed in 1:1 NAT. This chart is able to auto-detect this scenario if your cloud provider adds
an `ExternalIP` field to your kubernetes node objects.

On-prem, you should set a `wire.com/external-ip` annotation on your kubernetes nodes so that sftd is aware
of its external IP when it gets scheduled on a node.

If you use our kubespray playbooks to bootstrap kubernetes, you simply have to
set the `external_ip` field in your `group_vars`
```yaml
# inventory/group_vars/k8s-cluster
node_annotations:
wire.com/external-ip: {{ external_ip }}
```
And the `external_ip` is set in the inventory per node:
```
node0 ansible_host=.... ip=... external_ip=aaa.xxx.yyy.zzz
```
If you are hosting Kubernetes through other means you can annotate your nodes manually:
```
$ kubectl annotate node $HOSTNAME wire.com/external-ip=$EXTERNAL_IP
```
## Rollout
Kubernetes will shut down pods and start new ones when rolling out a release. Any calls
@@ -193,31 +219,6 @@ helm install wire-prod charts/wire-server --set 'nodeSelector.wire\.com/role=sft
helm install wire-staging charts/wire-server --set 'nodeSelector.wire\.com/role=sftd-staging' ...other-flags
```

## No public IP on default interface
Often on-prem or at certain cloud providers your nodes will not have directly routable public IP addresses
but are deployed in 1:1 NAT. This chart is able to auto-detect this scenario if your cloud providers adds
an `ExternalIP` field to your kubernetes node objects.
On on-prem you should set an `wire.com/external-ip` annotation on your kubernetes nodes so that sftd is aware
of its external IP when it gets scheduled on a node.
If you use our kubespray playbooks to bootstrap kubernetes, you simply have to
set the `external_ip` field in your `group_vars`
```yaml
# inventory/group_vars/k8s-cluster
node_annotations:
wire.com/external-ip: {{ external_ip }}
```
And the `external_ip` is set in the inventory per node:
```
node0 ansible_host=.... ip=... external_ip=aaa.xxx.yyy.zzz
```

If you are hosting Kubernetes through other means you can annotate your nodes manually:
```
$ kubectl annotate node $HOSTNAME wire.com/external-ip=$EXTERNAL_IP
```

## Port conflicts and `hostNetwork`

2 changes: 1 addition & 1 deletion charts/sftd/templates/configmap-join-call.yaml
@@ -14,7 +14,7 @@ data:
location /healthz { return 204; }
location ~ ^/sfts/([a-z0-9\-]+)/(.*) {
proxy_pass http://$1.sftd.${POD_NAMESPACE}.svc.cluster.local:8585/$2;
proxy_pass http://$1.{{ include "sftd.fullname" . }}.${POD_NAMESPACE}.svc.cluster.local:8585/$2;
}
}
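
To see what the changed `proxy_pass` line resolves to, the sketch below mimics the location regex in bash and prints the upstream URL for a request path. `upstream_for` is a hypothetical helper, and the service name `wire-sftd`, the namespace `default`, and the pod name `wire-sftd-0` are made-up values standing in for `{{ include "sftd.fullname" . }}`, `${POD_NAMESPACE}`, and the first capture group.

```shell
#!/usr/bin/env bash
# Sketch only: mimics the nginx location regex; not part of the chart.

# upstream_for PATH FULLNAME NAMESPACE
# Prints the URL the proxy_pass template would produce for PATH,
# or returns 1 if PATH does not match the /sfts/<pod>/<rest> pattern.
upstream_for() {
  local re='^/sfts/([a-z0-9-]+)/(.*)'
  if [[ $1 =~ $re ]]; then
    echo "http://${BASH_REMATCH[1]}.$2.$3.svc.cluster.local:8585/${BASH_REMATCH[2]}"
  else
    return 1
  fi
}

# Hypothetical request path, release full name, and namespace.
upstream_for "/sfts/wire-sftd-0/sft/call" "wire-sftd" "default"
```

With these made-up inputs the call prints `http://wire-sftd-0.wire-sftd.default.svc.cluster.local:8585/sft/call`. The old line hard-coded `sftd` as the service name, which presumably only resolved when the release's full name happened to be `sftd`.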
7 changes: 6 additions & 1 deletion charts/sftd/templates/statefulset.yaml
@@ -83,7 +83,12 @@ spec:
else
ACCESS_ARGS="-A ${EXTERNAL_IP}"
fi
exec sftd -I "${POD_IP}" -M "${POD_IP}" ${ACCESS_ARGS} -u "https://{{ required "must specify host" .Values.host }}/sfts/${POD_NAME}"
exec sftd \
-I "${POD_IP}" \
-M "${POD_IP}" \
${ACCESS_ARGS} \
{{ if .Values.turnDiscoveryEnabled }}-T{{ end }} \
-u "https://{{ required "must specify host" .Values.host }}/sfts/${POD_NAME}"
ports:
- name: sft
containerPort: 8585
4 changes: 4 additions & 0 deletions charts/sftd/values.yaml
@@ -81,3 +81,7 @@ joinCall:
# Overrides the image tag whose default is the chart appVersion.
tag: "1.19.5"

# Allow SFT instances to choose/consider using a TURN server for themselves as a proxy when
# trying to establish a connection to clients
# DOCS: https://docs.wire.com/understand/sft.html#prerequisites
turnDiscoveryEnabled: false
2 changes: 1 addition & 1 deletion charts/team-settings/values.yaml
@@ -9,7 +9,7 @@ resources:
cpu: "1"
image:
repository: quay.io/wire/team-settings
tag: "4.0.0-v0.28.21-b92fca-2"
tag: "4.2.0-v0.28.28-1e2ef7"
service:
https:
externalPort: 443
2 changes: 1 addition & 1 deletion charts/webapp/values.yaml
@@ -9,7 +9,7 @@ resources:
cpu: "1"
image:
repository: quay.io/wire/webapp
tag: 2021-09-06-staging.3-v0.28.24-e6e306b
tag: "2021-10-28-federation-M1"
service:
https:
externalPort: 443

0 comments on commit d6b9490